
Mnesia isn't in-memory only. It also journals to disk. You can also use disk-only tables (disc_only_copies) that don't hold the whole table in memory (but from what I've read, perf sucks... otoh, a lot of what people say about Mnesia conflicts with my experience, so maybe disc_only_copies is worth trying).
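To make the distinction concrete, here's a minimal sketch of creating both table types (the `user`/`user_archive` record and table names are made up for illustration):

```erlang
-record(user, {id, name}).

init_tables() ->
    %% disc_copies: full copy held in RAM, journaled to disk,
    %% survives restarts.
    {atomic, ok} = mnesia:create_table(user,
        [{disc_copies, [node()]},
         {attributes, record_info(fields, user)}]),
    %% disc_only_copies: kept on disk only (dets-backed), so the
    %% table doesn't have to fit in memory -- at a performance cost.
    {atomic, ok} = mnesia:create_table(user_archive,
        [{disc_only_copies, [node()]},
         {record_name, user},
         {attributes, record_info(fields, user)}]).
```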

OTP ships with mnesia_frag which allows fragmenting a logical table into many smaller tables. You don't need to have all of the tables on all of the nodes that share an mnesia schema. That's at least one way to scale mnesia beyond what fits in memory on a single node. Single nodes are pretty big though; we were running 512GB mnesia nodes 10 years ago on commodity hardware, and GCP says 32TB is available. You can do a lot within a limit of 32TB per node.
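A rough sketch of what that looks like, assuming a hypothetical `user` table split into 8 fragments over the connected-node pool, with each fragment replicated to 2 nodes:

```erlang
create_fragmented() ->
    mnesia:create_table(user,
        [{frag_properties,
          [{n_fragments, 8},
           {node_pool, [node() | nodes()]},
           {n_disc_copies, 2}]},
         {attributes, record_info(fields, user)}]).

%% Access goes through the mnesia_frag activity module, which routes
%% each read/write to the fragment owning that key.
read_user(Id) ->
    mnesia:activity(transaction,
                    fun() -> mnesia:read(user, Id) end,
                    [], mnesia_frag).
```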

There are other ways to shard too. At WhatsApp pre-FB, our pattern was to run mnesia schemas with 4 nodes: one half of the nodes were in service, the other half in our standby colo. All nodes had all the tables in the schema, and requests were sharded so each schema group served only 1/N of the users, with each of the two active nodes in a group getting half of the requests (except during failure/maintenance). We found 4-node schemas were the easiest to operate, and ensuring that in normal operations a single node (and in most cases, a single worker process) would touch specific data made us comfortable running our data operations in the async_dirty context, which avoids locking.
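The async_dirty part might look roughly like this (a sketch; the `session` record and `bump_counter` function are invented for illustration). It skips transaction overhead and locking entirely, which is only safe because the request routing guarantees one worker process owns a given user's rows at a time:

```erlang
-record(session, {id, count}).

bump_counter(UserId) ->
    %% async_dirty: no locks, no transaction log overhead; relies on
    %% sharding to prevent concurrent writers to the same key.
    mnesia:activity(async_dirty,
        fun() ->
            case mnesia:read(session, UserId) of
                [S = #session{count = C}] ->
                    mnesia:write(S#session{count = C + 1});
                [] ->
                    mnesia:write(#session{id = UserId, count = 1})
            end
        end).
```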

We did have scaling challenges (many of which you can watch old Erlang Factory presentations about), but it was all surmountable, and many of the things would be easier today given improvements to BEAM and improvements in available servers.







