You can't just say "immutable log" and then be done. You certainly don't want to...

dustingetz · on Nov 10, 2017

In RDBMS, when you shard, read shards and write shards are in lock-step, which is the whole problem with sharding. In Datomic (and in git), by sharding writes, it doesn't really impact reads.

This is interesting, because consider a large system like Facebook. Transactions naturally fall within certain boundaries. You never transact to Events, Photos, and Instagram all at once - from the write side, they don't have to share the same single-writer process delivering ACID.

You do however, on the read side, need to have fluid queries across them all, as if they were one database. RDBMS can't do that, but Datomic can, and Git can too - consider submodules. Immutability is what makes it possible to shard like this without sacrificing query expressiveness, strong consistency or ACID (like every other distributed system that isn't accumulate only)

mej10 · on Nov 10, 2017

I was under the impression, based on its docs, that Datomic only supports processing transactions serially through a single transactor.

SamReidHughes · on Nov 10, 2017

Not surprisingly, it also says it's not designed for high throughput write workloads, the topic of this blog post.

dustingetz · on Nov 11, 2017

Both what you write and what I wrote are true.

To scale writes you shard writes. This generally means running multiple databases (in any DBMS)

The key insight is that Datomic can cross-query N databases as a first class concept, like a triple store. You can write a sophisticated relational query against Events and Photos and Instagram, as if they are one database (when in fact they are not).

This works because Datomic reads are distributed. You can load some triples from Events, and some triples from Photos, and then once you've got all the triples together in the same place you can do your queries as if they are all the same database. (Except Datomic is a 5-store with a time dimension, not a triple store.)

In this way, a Datomic system-of-systems can scale writes, you have regional ACID instead of global ACID, with little impact to the programming model of the read-side, because reads were already distributed in the first place.

For an example of the type of problems Datomic doesn't have, see the OP paragraph "To sort, or not to sort?" - Datomic maintains multiple indexes sorted in multiple ways (e.g. a column index, a row index, a value index). You don't have to choose. Without immutability, you have to choose.

noahdesu · on Nov 11, 2017

What is the primary reason people choose Datomic? From reading the website, I get the impression that the append-only nature and time-travel features are a major selling point, but in other places its just the datalog and clojure interfaces. I'm sure it's a mix, but what brings people to the system to begin with?

dustingetz · on Nov 11, 2017

Datomic is my default choice for any data processing software (which is almost everything).

Immutability in-and-of-itself is not the selling point. Consider why everyone moved from CVS/SVN to Git/dvcs. The number of people who moved to git then and said "man I really wish I could go back to SVN" is approximately zero. Immutability isn't why. Git is just better at everything, that's why. Immutability is the "how".

I don't see why I would use an RDBMS to store data ever again. It's not like I woke up one day and started architecting all my applications around time travel†. It's that a lot of the accidental complexity inherent to RDBMS - ORM, N+1 problems (batching vs caching), poorly scaling joins, pressure to denormalize to stay fast, eventual consistency at scale...

Datomic's pitch is it makes all these problems go away. Immutability is simply the how. Welcome to the 2020s.

†actually I did, my startup is http://hyperfiddle.net/

noahdesu · on Nov 11, 2017

Thanks for the explanation, and I definitely agree with all your points. The reason I ask is that over the past year or so, we have been hacking on a little side project that maps a copy-on-write tree to a distributed shared log (https://nwat.io/blog/2016/08/02/introduction-to-the-zlog-tra...). This design effectively produces an append-only database with transactions (similar to the rocksdb interface), and means that you can have full read scalability for any past database snapshot. We don't have any query language running on top, but have been looking for interesting use cases.

dustingetz · on Nov 12, 2017

A lot of stuff in those links that I'm not familiar with. I'd expect challenges around what will happen when the log doesn't fit in memory? What if even the index you need doesn't fit in memory? If I understand, zlog is key-value-time, so I'm not sure what type of queries are interesting on that. Datomic is essentially a triple store with a time dimension so it can do what triple stores do. What do you think the use cases are for a key-value-time store?

noahdesu · on Nov 12, 2017

The log is mapped onto a distributed storage system, so fitting it in memory isn't a concern, though effective caching is important. The index / database also doesn't need to fit into memory. We cap memory utilization and resolve pointers within the index down into the storage system. Again, caching is important.

If I understand Datomic correctly, I can think of the time dimension in the triple store as a sequence of transactions each of which produces a new state. How that maps onto real storage is flexible (Datomic supports a number of backends which don't explicitly support the notion of immutable database).

So what would a key-value-time store be useful for? Conceptually it seems that if Datomic triples are mapped onto the key-value database then time in Datomic becomes transactions over the log. So one area of interest is as a database backend for Datomic that is physically designed to be immutable. There is a lot of hand waving, and Datomic has a large number of optimizations. Thanks for answering some of those questions about Datomic. It's a really fascinating system.

dustingetz · on Nov 12, 2017

My questions were in the context of effective caching in the query process, sorry that wasn't clear. A process on a single machine somewhere has to get enough of the indexes and data into memory to answer questions about it. Though I guess there are other types of queries you might want to do, like streamy mapreduce type stuff that doesn't need the whole log.

I will have to think about if a key-value-time storage backend would have any implications in Datomic. One thing Datomic Pro doesn't have is low-cost database clones through structure sharing, and it doesn't make sense to me why.

Here is a description of some storage problems that arise with large Datomic databases, though it's not clear to me if ZLog can help with this < http://www.dustingetz.com/ezpyZXF1ZXN0LXBhcmFtcyB7OmVudGl0eS..., >

noahdesu · on Nov 12, 2017

> A process on a single machine somewhere has to get enough of the indexes and data into memory to answer questions about it.

This common challenge is perhaps magnified in systems that have deep storage indexes. For example, the link you posted seems to suggest that queries in Datomic may have cache misses that require pointer chasing down into storage, adding latency. Depending on where that latency cost is eaten, it could have a lot of side effects (e.g. long running query vs reducing transaction throughput).

This problem is likely exacerbated in the key-value store running on zlog because the red-black tree can become quite tall. Aggressive, optimistic caching on the db nodes, and server-side pointer chasing helps. It's definitely an issue.

I don't know anything about Datomic internals, so disclaimer, I'm only speculating about why Datomic doesn't have low cost clones: which is that the underlying storage solution isn't inherently copy-on-write. That is, Datomic emulates this so when a clone is made there isn't an easy metadata change that creates the logical clone.

Despite the lack of features etc in the kv-store running on zlog, database snapshots and clones are both the same cost of updating the root pointer of the tree.

mej10 · on Nov 11, 2017

Ah, I have looked into Datomic a bit but didn't realize this, though it makes sense in retrospect.