A Graph-Based Firebase

fhur · on Aug 25, 2022

Very interesting. I came to many similar conclusions completely independently, and even attempted to build a type-safe in-memory Datalog in TypeScript.

I also came to the conclusion that just exposing Datalog triples as a query language would never feel right, and tried to expose a GraphQL-like language that generated the Datalog triples.

IMO React Relay is a great comparable offering with its normalized cache. Relay has great DX too and can be totally type-safe. To my knowledge, Datalog is way too dynamic for static analysis.

That being said, I would love to try Instant out. I'm really happy to see innovation in this area.

stopachka · on Aug 26, 2022

Thank you for the kind words! Both Joe and I are around if you have any feedback. In terms of type safety, we're thinking of it as an added layer. I'm optimistic that, since InstaQL provides objects as the interface, we could have more idiomatic types down the road.

eurasiantiger · on Aug 26, 2022

This comment thread looks like it could benefit from Dgraph, which is an open source GraphQL-layer database built on top of an RDF n-quad store (badger).

habosa · on Aug 26, 2022

I used to work on Firebase. I also use Firebase heavily in my work and personal projects.

I am normally quite skeptical of “Firebase but X” projects because they often seem to be running head first into the issues that the Firebase team so skillfully avoided.

This isn’t one of those projects. The author of this essay clearly gets what you need to actually build a backend for a modern app that’s also simple enough to get started with. I’m very excited about this project.

stopachka · on Aug 26, 2022

Thank you for the kind words.

pezo1919 · on Aug 26, 2022

Are transactions supported in that solution or in the mentioned Datalog-related ecosystem? I have built my own similar reactive in-memory triple store with TypeScript's type safety, but then I realized I still need transactions, which are a bit of a pain because transactions bundle the otherwise separate triplets, so the atomic, independent logic of triplets and related effects breaks a bit. (I am sure it's solvable.)

jitl · on Aug 26, 2022

The example code uses a `transact` function. But it really depends on what you mean by “transaction”. We don’t use a triple store at Notion, but we do use an abstraction that ensures a collection of operations either all succeed or all fail. We don’t support “interactive” transactions, where you can read, modify, and write as an atomic group. This just isn’t desirable in a multiplayer or offline system. In cases where we need that kind of consistency, we use a normal HTTP API, which is online-only.
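For readers who want the "all succeed or all fail" idea in concrete form, here is a minimal TypeScript sketch. The `Fact` and `Op` types and this `transact` signature are assumptions made up for illustration; this is not Notion's abstraction or Instant's actual `transact` API. Operations are applied to a draft copy, and the draft is only returned if every operation succeeds.

```typescript
// Hypothetical types for illustration only.
type Fact = readonly [string, string, unknown]; // [entity, attribute, value]
type Op = { kind: "add"; fact: Fact } | { kind: "retract"; fact: Fact };

function factKey(f: Fact): string {
  return JSON.stringify(f);
}

// Applies every op to a draft copy of the database.
// If any op fails, the error propagates and the original set is untouched.
function transact(db: ReadonlySet<string>, ops: Op[]): Set<string> {
  const draft = new Set(db);
  for (const op of ops) {
    const k = factKey(op.fact);
    if (op.kind === "add") {
      draft.add(k);
    } else {
      if (!draft.has(k)) throw new Error(`cannot retract missing fact ${k}`);
      draft.delete(k);
    }
  }
  return draft;
}

let db: ReadonlySet<string> = new Set<string>();
db = transact(db, [{ kind: "add", fact: ["task-1", "title", "Ship"] }]);

let failed = false;
try {
  // The second op fails, so the first op must not become visible either.
  db = transact(db, [
    { kind: "add", fact: ["task-1", "done", true] },
    { kind: "retract", fact: ["task-2", "title", "Missing"] },
  ]);
} catch {
  failed = true;
}
```

Note that this is a non-interactive transaction in the sense described above: the caller hands over a fixed list of operations and cannot read intermediate state while it is being applied.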

pezo1919 · on Aug 26, 2022

Cool, thanks! Now a question: how does a dev pick up the related knowledge? I have a BSc and 7 more years in development, but this area was totally gray to me. My motivation was only to come up with a scalable solution for offline-first apps with some kind of automatic persistence support, so at the end of the day my design goals were quite similar. How do you come up with stuff at Notion? Are there must-have books, or do you just go with gut, experience, and the existing solutions you are aware of?

stopachka · on Aug 26, 2022

As jitl points out, in multiplayer settings, what you need is some way to commit a series of transactions all together or not at all. We support this.

In the case where you want to read the database inside your transaction, we take inspiration from Datomic. Datomic runs all mutations in one high-memory box. You can provide functions that run in that box. This way, you can guarantee that the reads inside your transaction have the latest value. There's a lot of UX to figure out there, and this would be something to try to avoid in an offline-available setting.

pezo1919 · on Aug 26, 2022

Yes exactly this UX issue bothers me a lot, I wouldn't even go there. :D

What about migrations? Do you support them? That's another thing I need in my offline-first project; one of my other projects died because of the lack of them. (I need something that plays well with Expo.io.)

stopachka · on Aug 26, 2022

Migrations are very tough when you go offline-first. Cambria [^1] is an interesting read. For Instant, we are schemaless and think about offline more like a cache. In our case it's less of a problem.

[^1]: https://www.inkandswitch.com/cambria/

pezo1919 · on Aug 26, 2022

Thank you! Having had one of my projects die from it (lack of migrations, offline-first), I promised myself I'll solve that problem first, before writing a meaningful line of business code in a future project. :)

tommiegannert · on Aug 27, 2022

I'm working on building a database in the same space as InstantDB. Currently, it's an "object/graph database using Protobuf". There's a check to ensure updated Protobuf definitions are backwards-compatible. Of course, this still implicitly relies on using Protobufs correctly (i.e. a missing value is the same as zero/empty/null/nil), even though I'm trying to make it safe by default.

I'm curious what your needs are. Would you mind elaborating on what kind of migrations your project would have needed to not die?

pezo1919 · on Aug 26, 2022

Oh another thing: do you guys have twitter or something? :) Would love to follow the project and devs too.

stopachka · on Aug 26, 2022

We don't have a business twitter, but I'm @stopachka, and my cofounder is @JoeAverbukh. Thank you for the kind words :)

pezo1919 · on Aug 26, 2022

Oh, I just realized I've already been following you. :) Is it possible that you've used cyclejs/xstream at some point? :) Or maybe from the Future of Programming Slack (or whatever it is called. :))

stopachka · on Aug 26, 2022

I haven't checked these out, but I'm definitely intrigued. Peeked at cyclejs -- I'm a fan of FRP, and more recently structured concurrency.

pezo1919 · on Aug 26, 2022

Isn't FRP just naturally superior? :)

j-pb · on Aug 26, 2022

The basic intuitions they have are correct, but it feels like there is a serious lack of, or disregard for, the theoretical background needed to talk about this stuff properly.

> (pull db '[* {:team/task [* {:task/owner [*]}]}] team-id)

That's not Datalog. It's kind of a mix between a conjunctive query and a regular path query.

Datalog isn't even really a query language. It's a class of query languages with a specific expressive power.

Whereas SQL is essentially conjunctive queries with non-stratified Negation, Datalog is recursive conjunctive queries with no or only stratified negation.

Datalog also has nothing to do with triples, they are two completely orthogonal concepts.
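As a rough illustration of the "recursive conjunctive queries" point, here is a TypeScript sketch of naive bottom-up evaluation for the textbook reachability program: `reachable(X, Y) :- edge(X, Y).` and `reachable(X, Z) :- reachable(X, Y), edge(Y, Z).` The function name and the string encoding of facts are assumptions for this sketch, and it is not how any particular Datalog engine is implemented. It works over a plain binary relation, with no triples involved.

```typescript
type Pair = readonly [string, string];

// Naive fixpoint: keep applying the recursive rule until no new facts appear.
function reachable(edges: readonly Pair[]): Set<string> {
  // Base rule: every edge is a reachable pair.
  const facts = new Set<string>(edges.map(([a, b]) => JSON.stringify([a, b])));
  let changed = true;
  while (changed) {
    changed = false;
    for (const f of Array.from(facts)) {
      const [x, y] = JSON.parse(f) as [string, string];
      // Recursive rule: reachable(x, y) and edge(y, to) derive reachable(x, to).
      for (const [from, to] of edges) {
        if (from === y) {
          const derived = JSON.stringify([x, to]);
          if (!facts.has(derived)) {
            facts.add(derived);
            changed = true;
          }
        }
      }
    }
  }
  return facts;
}

const closure = reachable([
  ["a", "b"],
  ["b", "c"],
  ["c", "d"],
]);
```

The loop terminates because the set of possible pairs is finite and facts are only ever added, which is the property that makes recursion safe in this class of languages.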

I've been slaving away in this very space for years and this post heavily reminds me of my initial hubris. Building something truly foundational that can be universally implemented and perform well in all potential target languages, from JavaScript to WASM via e.g. Zig and Rust, while at the same time being both conceptually simple and easy to implement, is really, really, really difficult, with a lot of pain lurking in the details.

blain_the_train · on Aug 26, 2022

do you write about your findings and ideas? i would be interested to read more.

j-pb · on Aug 26, 2022

I regularly regret not writing a blog...

But if you want to talk about some ideas feel free to drop by at discord.gg/tribles

kevmo314 · on Aug 25, 2022

> These triples say that the Layer with id 1 has a fontSize 20 and backgroundColor blue. Since they are different rows, there’s no conflict.

This sounds a lot like Bigtable (https://cloud.google.com/bigtable), which also does last-write-wins conflict resolution. So this is adding a GraphQL + frontend layer to it?
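To make the quoted point concrete, here is a minimal TypeScript sketch of a triple store in which each (entity, attribute) pair is its own row and concurrent writes to the same row resolve by last-write-wins. The `TripleStore` class, the field names, and the numeric timestamps are assumptions for illustration; this is not Instant's or Bigtable's actual data model or API.

```typescript
// Hypothetical shapes for illustration only.
type Triple = { entity: string; attribute: string; value: unknown; timestamp: number };

class TripleStore {
  // One "row" per (entity, attribute) pair.
  private rows = new Map<string, Triple>();

  private key(entity: string, attribute: string): string {
    return JSON.stringify([entity, attribute]);
  }

  // Last-write-wins: a write replaces the row only if it is at least as new.
  write(t: Triple): void {
    const k = this.key(t.entity, t.attribute);
    const existing = this.rows.get(k);
    if (existing === undefined || t.timestamp >= existing.timestamp) {
      this.rows.set(k, t);
    }
  }

  get(entity: string, attribute: string): unknown {
    return this.rows.get(this.key(entity, attribute))?.value;
  }
}

const store = new TripleStore();
// Two users edit different attributes of the same layer: different rows, no conflict.
store.write({ entity: "layer-1", attribute: "fontSize", value: 20, timestamp: 2 });
store.write({ entity: "layer-1", attribute: "backgroundColor", value: "blue", timestamp: 2 });
// A stale write to the same row arrives late and is ignored.
store.write({ entity: "layer-1", attribute: "fontSize", value: 18, timestamp: 1 });
```

Only writes to the same attribute of the same entity ever compete, which is why the two edits in the quote do not conflict.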

stopachka · on Aug 26, 2022

I'm not as familiar with Bigtable's data model. Afaik they don't use a triple-store like system, but their data model does look interesting. [^1]

The concept you have is on point though. You can think of it as us having moved a graph database over to the frontend and introduced a GraphQL-like language for it.

[^1]: https://static.googleusercontent.com/media/research.google.c...

kevmo314 · on Aug 26, 2022

Very cool, super exciting to see a clean frontend to it. As a big Firebase fan, I'm looking forward to what you build :)

jinjin2 · on Aug 26, 2022

Another product that does all this (relations, reactivity, offline, permissions, sync, etc…) really well, is Realm.

It is beyond awesome for building mobile apps, but mobile only. Annoyingly they have never released a web version. No idea why.

I would still look into it for ideas. It is super powerful and a joy to work with. I would love to see something like that working in the browser.

dustingetz · on Aug 26, 2022

One possible solution to the "sync" layer (how do you sync the local db shard with the global consensus) looks like this: https://www.researchgate.net/publication/359578461_Continuou... (2022)

You also need to fit whole view updates into a 16ms frame budget (not just one query but every query on the page that is impacted by a change, as well as downstream reactive views). At some tipping point it can be faster to move relational queries to the cloud (sacrificing local-first), and treat local-first as an edge case (not all page components need to be live in offline mode; depending on the app, you may just need document edits, not relations).

stopachka · on Aug 26, 2022

The research paper looks intriguing. I'll look into it. I am not as convinced about moving queries to the cloud, when it comes to the north stars of Figma / Linear / Notion. It would be hard to get the same kind of UX.

tlarkworthy · on Aug 26, 2022

At Firebase we sometimes pondered the feasibility of a SQL version, but the semantics of SQL seem littered with footguns that don't lend themselves to offline, secure, scalable, event-driven distributed applications. We knew everybody wanted better query expressivity, but it was very difficult to see a path to delivering that in a mobile-friendly, client-side, secure package.

tripletstore/datalog actually seems like a decent compromise between SQL and no-SQL that could actually work out! Awesome idea!

stopachka · on Aug 26, 2022

I'm encouraged reading this. Thank you.

joe_fishfish · on Aug 26, 2022

Does Instant have any plans for iOS and Android SDKs? One of the biggest wins with Firebase is that it's really easy to integrate across platforms.

stopachka · on Aug 26, 2022

Yes! Right now one of our users implemented an RN integration. We'd love to support iOS and Android soon.

ar7hur · on Aug 26, 2022

I would really use this for many of my projects! For project A, I used Firebase, which was great initially, but I eventually ran into the limitations well described in Stopa's essay. For project B, I decided to write everything from scratch, but the amount of work it required is crazy, and of course the result is quite brittle.

stopachka · on Aug 26, 2022

Thrilled to hear this, thank you :).

fernandohur · on Aug 25, 2022

Another project with similar goals in mind: https://paulbutler.org/2020/the-webassembly-app-gap/

Also backed by YC.

paulgb · on Aug 26, 2022

Thanks for the mention!

For some context, that blog post discusses what I saw as missing pieces of the stack circa 2020. The startup I’m building (https://driftingin.space) is roughly the “lambda for websockets” part, whereas Instant is closer to the “generalized CRDT data layer part”.

jitl · on Aug 26, 2022

Is this blog post a company? What company?

dang · on Aug 26, 2022

https://news.ycombinator.com/item?id=30502978

mizzao · on Aug 26, 2022

This looks like an evolution of Meteor, doesn't it?

stopachka · on Aug 26, 2022

Meteor certainly inspires us!

eyeswideopen · on Aug 26, 2022

What would you say are your biggest differences?

stopachka · on Aug 26, 2022

Relations, and less vertical integration

antidnan · on Aug 26, 2022

Very cool! What're the differences between this and something like Replicache?

stopachka · on Aug 26, 2022

I haven't looked too deeply at Replicache, but the two differences as I understand it:

1. Replicache is a layer you add over an existing backend. Instant handles the backend.

2. They expose a key-value store. We expose a graph store.

Both 1&2 have pros and cons each way. I think we're inspired by the same problems, and their docs are really well done.

waveywaves · on Aug 26, 2022

Are there any plans to open source Instant in the future?

stopachka · on Aug 26, 2022

We like the open-core model. We have definite plans to open-source the client. We're still thinking about the best way to do this for the server.

revskill · on Aug 25, 2022

I use Hasura for this purpose. With some hooks, you can achieve offline mode, too.

You need some slightly tricky hacks to use the Hasura permission system the way you want, though.

stopachka · on Aug 26, 2022

Thanks for taking a look at the essay. As you mentioned, it is indeed possible if you store all commands and sync with IndexedDB. The edge cases get quite complicated though. We're optimistic that a local layer that handles transactions and sync could make things a lot easier.

alexashka · on Aug 25, 2022

Hasura does not offer syncing.

Syncing is hard :)

revskill · on Aug 25, 2022

Aha, yup. My approach to offline is to just store all commands offline and sync only the commands. Syncing the view is hard.
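A minimal TypeScript sketch of that "store commands, sync commands" approach might look like the following. The `Command` shape, the `OfflineCommandQueue` class, and the synchronous `send` callback are all assumptions for illustration (a real transport would be asynchronous and the queue would be persisted, for example in IndexedDB); this is not Hasura's API.

```typescript
// Hypothetical command shape; real apps would define their own payloads.
type Command = { id: number; type: string; payload: unknown };

class OfflineCommandQueue {
  private pending: Command[] = [];
  private nextId = 1;

  // Record a user action locally, whether or not the network is available.
  enqueue(type: string, payload: unknown): Command {
    const cmd: Command = { id: this.nextId++, type, payload };
    this.pending.push(cmd);
    return cmd;
  }

  // Replay pending commands in order. `send` returns false when the server
  // is unreachable; we stop there so that command order is preserved.
  flush(send: (cmd: Command) => boolean): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const next = this.pending[0];
      if (next === undefined || !send(next)) break;
      this.pending.shift();
      sent++;
    }
    return sent;
  }

  get size(): number {
    return this.pending.length;
  }
}

const queue = new OfflineCommandQueue();
queue.enqueue("rename-task", { taskId: "t1", title: "Ship it" });
queue.enqueue("complete-task", { taskId: "t1" });

const received: Command[] = [];
const sentWhileOffline = queue.flush(() => false); // offline: nothing leaves the queue
const sentWhenOnline = queue.flush((cmd) => {
  received.push(cmd);
  return true;
});
```

The sketch only covers the outgoing direction. Reconciling what the server sends back with local state is the "syncing the view" part, which is left out here.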

lf-non · on Aug 26, 2022

We used to do this for an app, but moved away from that eventually, because:

1. You need to support old commands (and payload schemas) indefinitely, or else you need migrations for them.

2. In highly interactive apps, this needs a lot of chatter. A peer that is offline for a few weeks may need to sync many many commands before they get to the latest version, and people generally don't like waiting on sync to complete. And syncing while users are interacting with the app gets complex pretty fast.

We tried to alleviate 2 using things like compaction and merging of commands, but eventually it ended up being much more complex than syncing the latest ui state which was a much smaller payload for our case.

It also eliminated many of the corner-case bugs in our command history compaction logic, where some clients could end up in invalid or unexpected states in very specific scenarios that were very hard to reproduce. Doing aggressive compaction while retaining effective order can get tricky if the object model is complex and deeply interlinked.
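For a sense of what the simplest form of compaction can look like, here is a TypeScript sketch that keeps only the last write per (entity, attribute) while preserving the relative order of the surviving commands. The `SetCommand` shape and the assumption that every command is an independent "set" with a unique `seq` are made up for illustration; the commenter's actual logic is not shown in the thread, and interlinked commands (creates, deletes, moves) are exactly where this naive version stops being safe.

```typescript
// Hypothetical command shape: every command overwrites one attribute of one entity.
type SetCommand = { seq: number; entity: string; attribute: string; value: unknown };

function targetKey(cmd: SetCommand): string {
  return JSON.stringify([cmd.entity, cmd.attribute]);
}

// Keeps the last command for each target and drops earlier, superseded ones.
// Assumes `seq` values are unique and the log is already in order.
function compact(log: readonly SetCommand[]): SetCommand[] {
  const lastSeq = new Map<string, number>();
  for (const cmd of log) {
    lastSeq.set(targetKey(cmd), cmd.seq);
  }
  return log.filter((cmd) => lastSeq.get(targetKey(cmd)) === cmd.seq);
}

const commandLog: SetCommand[] = [
  { seq: 1, entity: "t1", attribute: "title", value: "A" },
  { seq: 2, entity: "t1", attribute: "done", value: false },
  { seq: 3, entity: "t1", attribute: "title", value: "B" },
];
const compacted = compact(commandLog);
```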

Not saying it is not a valid approach, but YMMV.

kennymeyers · on Aug 25, 2022

Tell me more about this!