I personally would want to use an immutable functional language wherever I can, and only avoid it if I have a good reason not to. Immutability makes reasoning about programs significantly easier, especially if they rely on concurrency.
And finance in particular is a very natural fit, because there's just a lot of transformation of data and business logic.
Datomic is immutable in the sense that what you had for lunch today doesn't change what you had for lunch yesterday, where "lunch" is any arbitrary fact stored in the database.
I.e., you can ask to look at the entire database as it was yesterday, and run arbitrary queries against it.
You can also do speculative updates to it, in the sense of "show me the entire database as it would be if I were to have pizza for lunch".
It models this as a strictly linear succession of assertions and retractions of facts. Yesterday, `A` was true, today `A` is no longer true. While this new fact is recorded, it doesn't change the fact that yesterday, `A` was true.
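For concreteness, here's roughly what that looks like with the Datomic peer API. This is a minimal sketch, not their setup: the connection URI and the :lunch/food attribute are hypothetical, and it assumes that attribute already exists in the schema.

```clojure
(require '[datomic.api :as d])

(def conn (d/connect "datomic:dev://localhost:4334/lunches")) ; hypothetical URI

;; "Show me the entire database as it was yesterday."
(def yesterday
  (java.util.Date. (- (System/currentTimeMillis) (* 1000 60 60 24))))

(d/q '[:find ?food
       :where [?e :lunch/food ?food]]
     (d/as-of (d/db conn) yesterday))

;; Speculative update: the database as it would be if I had pizza.
;; d/with returns an in-memory database value; nothing hits storage.
(let [{:keys [db-after]} (d/with (d/db conn) [{:lunch/food "pizza"}])]
  (d/q '[:find ?food :where [?e :lunch/food ?food]] db-after))

;; Retraction: record that the salad fact no longer holds. This adds
;; a new datom; the as-of view above still sees yesterday's lunch.
(let [eid (ffirst (d/q '[:find ?e :where [?e :lunch/food "salad"]]
                       (d/db conn)))]
  @(d/transact conn [[:db/retract eid :lunch/food "salad"]]))
```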
What we see in reality is that an append-only database is unusable without additional "projections" (or whatever you call them): derived databases that are ready to be queried and updated, perhaps with specific denormalizations, indexes, and so on.
And oh, btw, those latter databases are not "immutable".
Datomic is an immutable log (kind of like git). The only operation is append. There is a head pointer stored pointing at "latest"; this is the point of mutation you're looking for, and it's the only one.
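As a toy analogy in plain Clojure (this illustrates the idea only, not Datomic's actual internals):

```clojure
;; The log is an immutable vector; the atom is the single "head"
;; pointer, and moving it to a new value is the only mutation.
(def head (atom []))

(defn append! [fact]
  (swap! head conj fact))

(append! [:assert  :lunch/food "salad"])
(append! [:retract :lunch/food "salad"])
(append! [:assert  :lunch/food "pizza"])

@head ; full history preserved; earlier vector values are untouched
```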
There are a lot of different kinds of financial institutions with a lot of different kinds of needs. In general, however, functional languages are a good fit for highly regulated domains, because they encourage splitting the (stateless) business legal rules for the domain from the stateful data management.
Complecting the two into, say, an Account object that holds both account metadata and the rules about which transactions can be part of the account quickly turns into an expensive maintenance nightmare.
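To make the distinction concrete, a minimal sketch (the attribute names and the rule itself are invented for illustration):

```clojure
;; Stateless business rule: a pure function over plain data, testable
;; and auditable in isolation, with no database in sight.
(defn overdraft? [account tx-amount]
  (neg? (- (:account/balance account) tx-amount)))

;; The stateful data management lives elsewhere; the rule never touches it.
(overdraft? {:account/balance 100M} 250M) ;=> true
```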
Looks like picking Clojure and Datomic has created a great deal of technical debt for them. They started adding Spec everywhere to specify their data because they were having big problems scaling their wild-west code base. But now Spec is dead and there's a new Spec2 alpha version.
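For context, "adding Spec everywhere" looks roughly like this, a minimal clojure.spec sketch where the account keys are hypothetical:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical account specs.
(s/def :account/id uuid?)
(s/def :account/balance decimal?)
(s/def :account/entity (s/keys :req [:account/id :account/balance]))

(s/valid? :account/entity
          {:account/id      (java.util.UUID/randomUUID)
           :account/balance 100M}) ;=> true
```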
I don't even know how they have managed to scale Datomic to that level. The support contract we had for Datomic was only really used to report bugs[0], but they have more than 2000 Datomic transactors? Ouch.
[0] Yes, too many bugs and slow, but databases are hard, so I guess this was expected for a closed-source niche DB with few users.
I was at the talk. Having engaged with a lot of organizations working at scale, I'd say it was pretty clear they were near the top in terms of not being hampered by technical debt. They have the capability to move quickly and evolve the system in pragmatic ways, dialing the knobs toward correctness or speed or functionality or whatever. A tremendous achievement, and it speaks volumes for the tooling.
There is no "Datomic scale" problem. Datomic transactors are just another singleton service to deploy with your microservice pod. The underlying storage can obviously be consolidated and scaled independently. I don't recall what they're using; I'd guess it's pg.
A single postgres daemon (a unit of scaling) can support many datomic transactors, each talking to a named "database" hosted by that one daemon.
You don't have to scale your postgres daemons with your microservices (each of which has its own transactor), which would be painful and out of the ordinary for an ops team.
Scaling datomic on top of postgres is no different from scaling any other microservice.
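For example, two logical databases sharing one postgres storage are just two connection URIs pointing at the same JDBC endpoint; the hosts, credentials, and database names below are made up:

```clojure
;; Datomic's SQL-storage URI format: datomic:sql://<db-name>?<jdbc-url>.
;; Both logical databases live in the same datomic_kvs table of the
;; same postgres daemon; credentials here are placeholders.
(def accounts-uri
  "datomic:sql://accounts?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=secret")

(def payments-uri
  "datomic:sql://payments?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=secret")
```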
That all said, the architectural point that did sound painful in the presentation (and which is a common pain point in microservice architectures, not unique to what Nubank is doing with Datomic) was having to maintain an ETL for analytics purposes, to pull together all of the distinct microservice-specific, Datomic-hosted data sets into a single, uniformly SQL-queryable data set. The details of that implementation, and whether it made use of the new SQL interface supported by Datomic, were not discussed. But it smelled brittle and fragile.
"A single postgres daemon (a unit of scaling) can support many datomic transactors"
Yes.
"each talking to a named 'database' hosted by that one daemon"
No. That would imply distributed writes and break Datomic's single-writer serializability property. Think about it: transactors don't sync with each other (there's a standby for HA, but that's orthogonal).
To put it simply, multiple transactors can share the same storage but only one can write to a single database at a time.
Ok, I never tried it, and I don't see any reason why it wouldn't work, but I also don't know how the peer library handles multiple connection objects. Anyway, this is a different issue from having multiple transactors configured to write to the same logical database.
It is interesting to consider, because it enables a constellation architecture where writes and reads are sharded independently. You could imagine a distributed computation like a social network, where each person owns their own transactor but you can still query across the parts of the social graph you care about.
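A sketch of what the query side of that could look like, since Datomic datalog accepts several database values as separate sources (the URIs and attribute names here are invented):

```clojure
(require '[datomic.api :as d])

;; Each source variable starting with $ binds one database value, so a
;; single query can span independently-written databases.
(d/q '[:find ?post
       :in $mine $theirs ?friend
       :where [$mine   ?f :person/name ?friend]   ; friend known in my db
              [$theirs ?p :post/author ?friend]   ; their posts, their db
              [$theirs ?p :post/text   ?post]]
     (d/db (d/connect "datomic:dev://localhost:4334/alice")) ; hypothetical
     (d/db (d/connect "datomic:dev://localhost:4334/bob"))   ; hypothetical
     "Bob")
```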
They officially launched in 2014, I believe. They have some 20 million customers, last I checked, which works out to roughly 10k customers per day on average since 2014 (20,000,000 customers over ~2,000 days ≈ 10,000/day). It makes sense that it would have been fewer per day in the beginning, and then accelerated.
Datomic scales horizontally for reads but not for writes. Then again, MySQL natively doesn't scale for reads or writes; most people looking to scale MySQL use a clustered approach with many readers and one writer, via a third-party or enterprise solution.
I don't know of a reliable multi-writer system for MySQL that doesn't make significant trade-offs.
At work we use multiple MySQL servers to handle "scaling", so it's no surprise to me that Nubank is using multiple Datomic servers.
Unless you start getting into eventual-consistency territory, distributed writes are not a trivial problem, and singling out Datomic over this is a bit odd.
As for the Spec stuff: I don't think it's a surprise to Nubank that spec is being upgraded; it literally says alpha in the namespace. They have some great Clojure engineers at Nubank, particularly the developer of Pathom; they're really pushing the envelope in terms of distributed graphs and front end.
Not only that, but they had stated intentions of using `Clojure for everything`. From configuration management (currently using Ruby, if I recall correctly) to deployment, they want to use Clojure on all fronts.