Hacker News

(I work at Notion, one of the larger Notion clones)

We experimented with using partitioned tables and doing the fancy postgres_fdw thing when we sharded our tenant-clustered tables in 2020. Even with all the Postgres instances in the same region we found the approach unwieldy. Routing queries in the application gives you a lot more control versus needing to teach Postgres the exact logic to route a query, plus do a (dangerous? one-way? global?) schema migration whenever you want to change the routing logic.
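The "routing queries in the application" approach can be sketched roughly like this (a generic illustration with hypothetical names, not Notion's actual implementation): hash the tenant/workspace ID with a stable hash so every app server agrees on the tenant-to-shard mapping, then connect to the chosen shard directly.

```python
# Sketch of application-level shard routing (hypothetical names;
# not Notion's actual code). The application, not Postgres, decides
# which physical instance a tenant's queries go to.
import hashlib

SHARD_DSNS = [
    "postgres://shard-00.internal/app",
    "postgres://shard-01.internal/app",
    "postgres://shard-02.internal/app",
    "postgres://shard-03.internal/app",
]

def shard_for(workspace_id: str, num_shards: int = len(SHARD_DSNS)) -> int:
    """Deterministically map a workspace ID to a shard index.

    Uses a stable hash (not Python's built-in hash(), which is salted
    per-process) so every app server computes the same mapping.
    """
    digest = hashlib.sha256(workspace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def dsn_for(workspace_id: str) -> str:
    """Connection string for the shard that owns this workspace."""
    return SHARD_DSNS[shard_for(workspace_id)]
```

Changing the routing logic here is an ordinary application deploy, which is the contrast being drawn with baking the same logic into Postgres partition definitions.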

We touched on this briefly in our sharding blog post here: https://www.notion.so/blog/sharding-postgres-at-notion

Another reason to avoid complex routing in Postgres is risk. If something goes wrong at 3am in a load-bearing cross-Postgres-instance query, how easy will it be to mitigate if the routing is happening in Postgres (with whatever advanced clustering system: Citus, pgzx, postgres_fdw) versus if the routing is happening in the application? For example, what if there’s a network partition between the “global” Postgres instance and the us-east-2 cluster? Maybe you’re Postgres wizards and know how to handle this with a quick schema change or something in the psql CLI, but I’d bet more on my team’s ability to write fault tolerance in the application versus in Postgres internal logic.
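One way the "fault tolerance in the application" point plays out in practice (a generic sketch, not anything specific to Notion's stack, with `query_shard` standing in for a real driver call): the app can fan a query out to shards, catch a per-shard failure, and serve partial results, whereas a single cross-instance FDW query tends to fail wholesale.

```python
# Generic sketch: fan a query out to several shards and degrade
# gracefully when one shard is unreachable, instead of failing the
# whole request the way one cross-instance query would.
from concurrent.futures import ThreadPoolExecutor

def query_shard(dsn: str, sql: str) -> list:
    # Placeholder for a real driver call (e.g. psycopg). Raises on
    # connection failure just as a real driver would.
    raise ConnectionError(f"cannot reach {dsn}")

def fan_out(dsns: list[str], sql: str) -> tuple[list, list[str]]:
    """Run sql on every shard; return (rows, unreachable_shards)."""
    rows, failed = [], []
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(query_shard, dsn, sql): dsn for dsn in dsns}
        for future, dsn in futures.items():
            try:
                rows.extend(future.result(timeout=5))
            except Exception:
                failed.append(dsn)  # log, alert, serve partial results
    return rows, failed
```

Whether partial results are acceptable is a product decision, which is exactly the kind of call that is easier to make and change in application code than inside Postgres.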




Thanks a lot for commenting and pointing me to the blog post. I do think I've seen it before but forgot about it. I've re-read it now and it's a great read!

From what I understand you decided to do sharding in the application code, and given the current state I think that makes total sense and I'd have probably done the same.

Part of my point with the blog post is that there is a built-in horizontal sharding solution in vanilla Postgres (partitioning + FDW), but it's currently badly lacking when it comes to cluster management, schema changes, distributed transactions, and more. If we put work into it, perhaps we tip the balance and the next successful Notion clone could choose to do it at the DB level.


Thanks for this commentary! I'm at a startup where we are preparing to shard Postgres. I'd be curious whether you're familiar with AWS Limitless, and how you would have approached deciding whether to use it vs. the approach in the blog post had it existed back in 2021.


I’m a solid “no” on any primary-datastore database thingy that has under 5 years of industry-wide production workload experience, and would seriously weigh it against something with 10+ years of industry use.

In 2019 when I was interviewing at companies for my next position I heard from a few places that the original Aurora for Postgres lost their data. It seems like the sentiment on mainline Aurora has improved a bit, but I would never bet my company’s future on an AWS preview technology. Better the devil you know (and everyone else knows).


What about CockroachDB? There are real-world, large-scale deployments of it (e.g. Netflix) going back more than 5 years, easily.


It might be a good choice, I don't know enough about either the technology or the market for CockroachDB expertise.


My biggest concern with Limitless – other than inherent performance issues with Aurora – is that according to their docs, it’s built on Aurora Serverless. IME, Serverless anything tends to get absurdly expensive very quickly.



