Datomic is perfect for probably 90% of small-ish backoffice systems that never have to be web scale (i.e. most of what I do at work).
Writing in a single thread removes a whole host of problems in understanding (and implementing) how data changes over time. (And a busy MVCC sql db spends 75% of its time doing coordination, not actual writes, so a single thread applying a queue of transactions in sequence can be faster than your gut feeling might tell you.)
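The single-writer idea can be sketched in a few lines. This is a toy illustration of serializing writes through one thread (names and data model made up), not Datomic's actual transactor:

```python
import queue
import threading

class SingleWriter:
    """Applies transactions one at a time from a queue: no locks,
    no write conflicts, and a total order over all changes."""
    def __init__(self):
        self.state = {}
        self.log = []               # ordered transaction log
        self._q = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            tx, done = self._q.get()
            for key, value in tx:   # apply the whole tx atomically
                self.state[key] = value
            self.log.append(tx)
            done.set()

    def transact(self, tx):
        done = threading.Event()
        self._q.put((tx, done))
        done.wait()                 # returns once the tx is applied

db = SingleWriter()
db.transact([("user/1 name", "Alice")])
db.transact([("user/1 name", "Alicia")])
print(db.state["user/1 name"])  # later tx wins: "Alicia"
```

Because every change flows through one queue, "how did the data get into this state" is always answerable by replaying the log in order.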
Transactions as first-class entities of the system mean you can easily add metadata to every change in the system recording who made it and why, so you'll never again have to wonder "hmm, why does that column have that value, and how did it happen?". Once you get used to this, doing UPDATE in SQL feels pretty weird, as the default mode of operation on your _business data_ is to delete data, with no trace of who changed it or why!
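A toy illustration of reified transactions: store facts as an append-only log of (entity, attribute, value, tx) tuples, and let the transaction itself be an entity that carries audit facts. The attribute names here are made up, and this is not Datomic's API (in Datomic you'd assert facts on the transaction via the "datomic.tx" tempid):

```python
import itertools

datoms = []                  # append-only log of (entity, attribute, value, tx)
tx_counter = itertools.count(1)

def transact(facts, who, why):
    """Record facts plus audit metadata on the transaction itself."""
    tx = next(tx_counter)
    datoms.append((f"tx/{tx}", "audit/user", who, tx))
    datoms.append((f"tx/{tx}", "audit/reason", why, tx))
    for e, a, v in facts:
        datoms.append((e, a, v, tx))
    return tx

def explain(entity, attribute):
    """Who set this attribute to its current value, and why?"""
    e, a, v, tx = [d for d in datoms if d[0] == entity and d[1] == attribute][-1]
    meta = {d[1]: d[2] for d in datoms if d[0] == f"tx/{tx}"}
    return v, meta["audit/user"], meta["audit/reason"]

transact([("order/42", "order/status", "pending")], "alice", "order created")
transact([("order/42", "order/status", "cancelled")], "bob", "customer request")
print(explain("order/42", "order/status"))
# ('cancelled', 'bob', 'customer request')
```

Nothing is ever deleted, so the "why does that column have that value" question reduces to a lookup of the transaction that last asserted it.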
Having the value of the entire database at a point in time available to your business logic as a (lazy) immutable value you can run queries on opens up completely new ways of writing code, and lets your database follow "functional core, imperative shell". Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?
Looking forward to seeing what this does for the adoption of Datomic!
> Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?
This one confused me. The obvious reason why you don't want the whole working set of your database in the app server's memory is because you have lots of app servers, whereas you only have one database[1]. This suggests that you put the working set of the database in the database, so that you still only need the one copy, not in the app servers where you'd need N copies of it.
The rest of your post makes sense to me but the thing about keeping the database's working set in your app server's memory does not. That's something we specifically work to avoid.
[1] Still talking about "non-webscale" office usage here, that's the world I live in as well. One big central database server, lots of apps and app servers strewn about.
Consider this use case - in addition to your web app, you have a reporting service that makes heavy duty reports; if you run one at a bad time, bad things might happen like users not being able to log in or do any other important work, because the database is busy with the reports.
So in a traditional DB you might have a DBA set up a reporting database so the operational one is not affected. With Datomic, the reporting service gets its own Datomic peer with its own copy of the DB, without any extra DBA work and without affecting any web services. This also works nicely with batch jobs, or in any situation where you don't want different services affecting each other's performance.
It's true that a lot more memory gets used, but memory is relatively cheap: when hosting in the cloud, the biggest cost is usually the vCPUs. And in a typical Clojure/Datomic web application you don't need to put cache services like Redis in front of your DB.
The assumption here is that the usual bottleneck for most information systems and business applications is reading and querying data.
I appreciated this insight into other people's use cases, thank you for that! This architecture brings RethinkDB to mind, which also had some ability to run your client as a cluster node that you alone get to query. (Although there it was more about receiving the live feed than about caching a local working set.)
> Client (notice not Proxy) caches uncommitted writes to support read-uncommitted-writes in the same transaction. This type of read-repair is only feasible for a simple k/v data model. Anything slightly more complicated, e.g. a graph data model, would introduce a significant amount of complexity. Caching is done on the client, so read queries can bypass the entire transaction system. Reads can be served either locally from client cache or from storage nodes.
RethinkDB user here. I've been running it in production for the last 8 years or so. It works. It doesn't lose data. It doesn't corrupt data (like most distributed databases do, read the Jepsen reports).
I am worried about it being unmaintained. I do have some issues that are more smells than anything else — like things becoming slower after an uptime of around three weeks (I now reboot my systems every 14 days). I could also do with improved performance.
I'm disappointed that the Winds of Internet Fashion haven't been kind to RethinkDB. It was always a much better proposition than, say, MongoDB, but got less publicity and sort of became marginalized. I guess correct and working are not that important.
I'm slowly rebuilding my application to use FoundationDB. This lets me implement changefeeds, is a correct distributed database with fantastic guarantees (you get strict serializability in a fully distributed db!), and lets me avoid the unneeded serialization to/from JSON as a transport format.
We've never had any issue with it on a typical three-node install in Kubernetes. It requires essentially no ongoing management. That said, it can't be ignored that the company went under and now it's in community maintenance mode. If you don't have a specific good use for Rethink's changefeed functionality, which sets it apart from the alternatives, I'm not sure I could recommend it for new development. We've chosen not to rip it out, but we're not developing new things on it.
I remember back when it came out, it was a big deal that it could easily scale with master-master nodes, and its document format was a big thing because of Mongo's popularity at the time.
That was back before k8s was a thing, and most of the failover support in other databases didn't exist yet. I'm too scared to use it now: it has a community, but it's obviously nowhere near as active as the Postgres and other communities.
I suppose I think of this the other way around. When the query engine is inside your app, the query language doesn't need loop-like constructs, so you can have a much simpler query language and mix it with plain code. It's similar to how you don't need a templating language in React (or in Clojure when you use hiccup to represent HTML).
Additionally, this laziness means your business logic can dynamically choose which data to pull in based on the results of earlier queries, and you end up pulling in less data, as you never need to fetch extra data in one huge query just in case the business logic needs it.
It's definitely a trade-off! If you have 10s or 100s of app servers that have the exact same working set in memory, it's probably not worth it.
But if you have a handful of app servers, it's much more reasonable. The relatively low scale back-office systems I tend to work with typically have 2, max 3. Also, spinning up an extra instance that does some data crunching does not affect the performance of the app servers, as they don't have to coordinate.
There's also the performance and practicality benefits you get from not having to do round-trips in order to query. You can now actually do 100 queries in a loop, instead of having to formulate everything as a single query.
And if you have many different apps that operate on the same DB, it becomes a benefit as well. Each app server will only have the _actual_ working set it queries in memory, not the sum of the working sets across all of the apps.
If this becomes a problem, you can always architect your way around it, e.g. by having two beefy API app servers that your 10s or 100s of other app servers talk to.
SQLite provides a similar benefit, with tremendous results, using its in-process database engine, although the benefit there is more muted by default because of the very small default cache size. We do have one app where we do this: there's no database server, the app server uses SQLite to talk directly to S3, and the app server itself caches its working set in memory. I can definitely see the benefit in some situations, but for us it was a pretty unusual situation that we might never encounter again.
All that said... can't Datomic also do traditional query execution on the server? I thought it had support for scale-out compute for that. AIUI, you have the option to run as either a full peer or just an RPC client on a case-by-case basis? I thought you wouldn't need to resort to writing your own API intermediary, you could just connect to Datomic via RPC, right?
AIUI, the full peer is Datomic; the RPC server is just a full peer that exposes the API over http and is mainly intended to be used with clients that don't run on the JVM (and so can't run Datomic itself in-process).
I think the point is that treating your database as an arms-length, RPC component that's independent from your "application" isn't necessarily the only pattern.
Strong agree. There are massive cost savings and performance advantages to be had if the model is that a shard of the dataset is in memory and the data-persistence problem is the part that's made external. The only reason we are where we are today is that doing that well is hard.
Is this not the case already? Database drivers (or just your application code) are allowed to cache results if they like. The problem is cache invalidation.
Understood. For a single-node or read-only system it sounds fine, but then there are a variety of ways to solve that (e.g. a preloaded in-memory sqlite).
Having the working set present on app servers means they don't put load on a precious centralized resource which becomes a bottleneck for reads. The peer model allows app servers to service reads directly, avoiding the cost of contention and an additional network hop, allowing for massive read scale.
This is true, but the tradeoff is that now your central DB is a bottleneck that is difficult to scale.
Having the applications keep a cached version of the db means that when one of them runs a complex or resource intensive query, it's not affecting everyone else.
> Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale (i.e. most of what I do at work).
I don’t think I agree with this as stated. It is too squishy and subjective to say “perfect”.
More broadly, the above is not and should not be a cognitive “anchor point” for reasonable use cases for Datomic. Making that kind of claim requires a lot more analysis and persuasion.
Datomic always seemed like a really cool thing to use. However, I'm not familiar with Clojure or any other JVM based language, nor do I have the time to learn it. And I can't find any supported way to use it with other languages (I'm not even talking about popular frameworks), or am I missing something?
It doesn't feel like the people behind Datomic actually want to have users outside of the Clojure world, which will be rather limiting to adoption.
Something I've been curious about: how well (or badly) would it scale to do something similar on a normal relational DB (say, Postgres)?
You could have one or more append-only tables that store events/transactions/whatever you want to call them, and then materialized-views (or whatever) which gather that history into a "current state" of "entities", as needed
If eventual-consistency is acceptable, it seems like you could aggressively cache and/or distribute reads. Maybe you could even do clever stuff like recomputing state only from the last event you had, instead of from scratch every time
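A minimal sketch of the append-only-table idea, using Python's built-in sqlite3 as a stand-in for Postgres (the SQL is portable; in Postgres you'd likely use a real materialized view rather than the plain view shown here, and all table/column names are made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE events (
        id      INTEGER PRIMARY KEY,   -- append-only, monotonic
        entity  TEXT NOT NULL,
        attr    TEXT NOT NULL,
        value   TEXT NOT NULL
    );
    -- "current state": the latest value per (entity, attr)
    CREATE VIEW current AS
        SELECT entity, attr, value
        FROM events e
        WHERE id = (SELECT MAX(id) FROM events
                    WHERE entity = e.entity AND attr = e.attr);
""")
db.executemany("INSERT INTO events (entity, attr, value) VALUES (?, ?, ?)", [
    ("user/1", "name", "Alice"),
    ("user/1", "email", "alice@example.com"),
    ("user/1", "name", "Alicia"),            # an "update" is a new event
])
print(db.execute("SELECT value FROM current "
                 "WHERE entity = 'user/1' AND attr = 'name'").fetchone())
# ('Alicia',)
```

Because `events.id` is monotonic, a reader that remembers the last id it saw can refresh its derived state incrementally instead of recomputing from scratch.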
Datomic already sort of does this :) You configure a storage backend (Datomic does not write to disk directly) which can be DynamoDB, Riak, or any JDBC database including Postgres. You won't get readable data in PG though, as Datomic stores opaque compressed chunks in a key/value structure. The chunks are addressable via the small handful of built-in indexes that Datomic provides for querying, and the indexes are covering, i.e. data is duplicated for each index.
Because, for example, your application is not tied to the JVM? You are uncomfortable using closed source software for such a critical piece of infra? As far as I can tell they don't even have a searchable bug report database! I'd hate to be the one debugging an issue involving datomic.
That's a pretty common pattern in event-sourcing architectures. It is a completely viable way to do things as long as "eventual-consistency is acceptable" is actually true.
Good question! I don't have any personal experience in that regard. I would probably have paid up for enterprise support (or bought the entire company ;))
I don't know how they do it, but the obvious answer is probably sharding. Their cloud costs must be no joke. Peers require tons of memory, and I can only guess they must have thousands of transactors to support that workload, and who knows how many peers. Add to this that they probably need something like Kafka for integrating/pipelining all this data.
As do most distributed databases. Even when you don't store your entire database (or working set) in memory, you'll likely still have to add quite a bit of memory to be used as I/O cache.
One thing which is quite hard to do in Datomic is simple pagination on a large sorted dataset, as one can easily do with LIMIT/OFFSET in MySQL for example. There are solutions for some of the cases, but general case is not solved, as far as I remember (it’s been a while I used it extensively)
It depends! If you want to lazily walk data, you can read directly from the index (keep in mind, the index = the data = lives in your app), or use the index-pull API which is a bit more convenient.
However, if you want to paginate data that you need to sort first, and the data isn't sorted the way you want in the index, you have to read all of the data first, and then sort it. But this is also what a database server would need to do :)
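When the index order does match, the usual workaround is keyset ("seek") pagination rather than OFFSET. A toy illustration of lazily walking a sorted covering index (the tuples and attribute names are made up; in Datomic you'd walk `d/datoms` or `d/index-pull` with a start point):

```python
import bisect

# A sorted covering index (AVET-like: attribute, value, entity),
# standing in for a Datomic index segment.
index = sorted(
    ("user/name", f"user{i:04d}", i) for i in range(10_000)
)

def page_after(last_value, n):
    """Keyset pagination: seek past the last value seen, take n.
    No OFFSET scan; cost is one binary search plus the page size."""
    start = bisect.bisect_right(index, ("user/name", last_value, float("inf")))
    return index[start:start + n]

page1 = page_after("", 3)                 # first page
page2 = page_after(page1[-1][1], 3)       # resume after page1's last value
print([e for _, v, e in page2])  # [3, 4, 5]
```

The catch the parent comment describes is exactly when no index has the order you want: then there is no "seek" position to resume from, and you're back to sorting the full result first.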
Yep, I am well aware of these specifics and workarounds, but in the general case there is no general solution to the question asked here; see for example [0].
And for big datasets with complex sorting it will take some effort to implement a seemingly simple feature.
Guess it is just one of the tradeoffs: while some features Datomic has out of the box are hard to replicate in RDBMSes, things like pagination, which are often taken for granted, are a bit of work in Datomic. So it is something to keep in mind when considering the switch.
Datomic's covering indexes are heavily based on their built-in ordering, and don't offer much flexibility in how you sort and walk data.
Personally, I'm a fan of hybrid database approaches. In the world of serverless, I really enjoy the combo of DynamoDB and Elasticsearch, for example, where Dynamo handles everything that's performance critical, and Elasticsearch handles everything where dynamic queries (and ordering, and aggregation, and ...) are required. I've never done this with Datomic, but I'd imagine mirroring the "current" value of entities without historical data is relatively easy to set up.
The main difference between event sourcing and Datomic is the indexes and the "schema", which provide full SQL-like relational query powers out of the box, as well as point-in-time snapshots for every "event" (transaction of facts).
So, "events" in Datomic are structured and Datomic uses them to give you query powers, they're not opaque blobs of data.
> doing UPDATE in SQL feels pretty weird, as the default mode of operation of your _business data_ is to delete data, with no trace of who and why!
It's a good idea to version your schema changes using something like liquibase into git, that gets rid of at least some of those pains. Liquibase works on a wide variety of databases, even graphs like Neo4j
I got the same feeling in Erlang many times: once write operations start happening in parallel you worry about atomicity, and making an Erlang process centralize writes through its message queue always feels natural and easy to reason about.