hruk's comments | Hacker News

bruh


This is just untrue - the naive implementation (make the API call, write a single row to the db) will work fine, as transactions are quite fast on modern hardware.
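A minimal sketch of that naive pattern using Python's built-in sqlite3 module; the table, column, and function names are made up purely for illustration:

```python
import sqlite3

# Hypothetical schema: one row per API result.
conn = sqlite3.connect("app.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
)

def record_api_result(payload: str) -> None:
    # The connection object acts as a context manager: it wraps the statement
    # in a transaction and commits on success (rolls back on error).
    with conn:
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))

record_api_result('{"status": "ok"}')
```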

What do you consider "serious" work? We've served a SaaS product from SQLite (roughly 300-500 queries per second at peak) for several years without much pain. Plus, it's not like PG and MySQL are pain-free, either - they all have their quirks.


Edit: disregard. I read it as he'd done it and had contention problems.

I mean, it's not fine if he's got lock contention from SQLITE_BUSY errors, now is it, as he implies. Many of his issues will stem from transactions blocking each other; maybe they're long-lived, maybe they're not. And are those 300-500 queries per second writes or reads? Because reads are not a problem.


Roughly 80/20 read to write. On the instance's gp3 EBS volume (which is pretty slow), we've pushed ~700 write transactions per second without much problem.


For small OLTP workloads the locking is not going to be a problem. But anything that holds the write lock for even a measurable fraction of a second will gum things up real fast. Transactions that need it for many seconds? You'll quickly be dead in the water.


This is basically Breiman's "two cultures" at play. Do you care about optimizing y-hat, or do you care about doing inference on some parameters in your model? Depends on the business case, typically.
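A toy illustration of the split, with invented data and off-the-shelf libraries (scikit-learn standing in for the prediction-focused culture, statsmodels for the inference-focused one):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Toy data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# "Algorithmic modeling" culture: all that matters is how good y-hat is.
pred_model = GradientBoostingRegressor().fit(X, y)
print("MSE:", mean_squared_error(y, pred_model.predict(X)))

# "Data modeling" culture: we care about the parameters themselves.
inf_model = sm.OLS(y, sm.add_constant(X)).fit()
print(inf_model.params)      # point estimates
print(inf_model.conf_int())  # confidence intervals for inference
```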


You can do fairly well here with ridge regression as a poor man's hierarchical model. We've used this library's Bayesian ridge regression to support a geo-pricing strategy (and it contains the Dirichlet-Multinomial approach as well): https://github.com/bayesianbandits/bayesianbandits
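For a sense of the general idea (this sketch uses scikit-learn's BayesianRidge rather than the library linked above, and the region data is invented): one-hot encode the group and let the shared ridge prior shrink the per-group coefficients toward each other, which is the partial-pooling effect you'd otherwise get from a hierarchical model.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.preprocessing import OneHotEncoder

# Hypothetical geo-pricing data: region and an observed outcome per sale.
regions = np.array([["east"], ["west"], ["east"], ["south"], ["west"], ["south"]])
outcome = np.array([1.2, 0.8, 1.1, 0.9, 0.7, 1.0])

# One-hot encode the region; the ridge prior shrinks per-region coefficients
# toward the shared intercept, much like partial pooling.
X = OneHotEncoder().fit_transform(regions).toarray()
model = BayesianRidge().fit(X, outcome)

# Posterior predictive mean and standard deviation for each region.
mean, std = model.predict(np.eye(X.shape[1]), return_std=True)
print(mean, std)
```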


All databases are painful in their own way. I've used all three at various times in my career, and I think SQLite behaves quite predictably, which has made it a lot easier for me personally to administrate.

If I had to start something new, I'd use SQLite until at least high 5 digit queries per second. Maybe more.


Agree on many things here, but SQLite does support WAL mode, which allows one writer and N readers with snapshot isolation on reads. Writes are serialized but still quite fast.
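For reference, enabling it is a one-line pragma; the busy_timeout value below is an arbitrary example, not a recommendation.

```python
import sqlite3

conn = sqlite3.connect("app.db")

# journal_mode=WAL is persistent: it's stored in the database file, so it only
# needs to be set once, but re-issuing it is harmless.
print(conn.execute("PRAGMA journal_mode=WAL").fetchone())  # ('wal',)

# Optional: have writers wait (here 5000 ms, arbitrary) for the single write
# lock instead of failing immediately with SQLITE_BUSY.
conn.execute("PRAGMA busy_timeout=5000")
```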

SQLite (actually SQL-ite, like a mineral) may be light, but so are many workloads these days. Even 1000 queries per second is quite doable with SQLite and modest hardware, and I've worked at billion-dollar businesses handling fewer queries than that.


We've used this Python package to do this: https://github.com/bayesianbandits/bayesianbandits


FWIW, I've been running a system with roughly 100K users, about 25 qps on average, with a single SQLite file for several years. No issues with data.


That's... pretty amazing. It sounds crazy to me (I'm obsessive about hourly backups), but do you use something like Litestream to keep copies?


From some months ago: https://news.ycombinator.com/item?id=43076785

> searchcode.com’s SQLite database is probably one of the largest in the world, at least for a public facing website. It’s actual size is 6.4 TB.


Yep, we use Litestream. It's been very reliable.


We've used this library for Bayesian contextual bandits in production (we have a critical business use case supported by a ~200K feature sparse Linear UCB bandit). It's a small community, but it's also a small enough codebase that we've read through all of it and feel fine about maintaining it ourselves in case it goes inactive.

https://github.com/bayesianbandits/bayesianbandits
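For anyone unfamiliar with Linear UCB, here's a stripped-down dense sketch of the core update rule; the feature dimension and alpha are arbitrary, and a real sparse, ~200K-feature deployment would use the library's implementation rather than raw numpy like this:

```python
import numpy as np

class LinUCBArm:
    """Minimal dense LinUCB arm: keeps A = I + sum(x x^T) and b = sum(r * x)."""

    def __init__(self, n_features: int, alpha: float = 1.0):
        self.alpha = alpha
        self.A = np.eye(n_features)    # regularized design matrix
        self.b = np.zeros(n_features)  # accumulated reward-weighted features

    def ucb(self, x: np.ndarray) -> float:
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        # Point estimate plus an exploration bonus from parameter uncertainty.
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x: np.ndarray, reward: float) -> None:
        self.A += np.outer(x, x)
        self.b += reward * x

# Usage: score each arm's UCB for the current context and pull the argmax.
arms = [LinUCBArm(n_features=4) for _ in range(3)]
context = np.random.rand(4)
chosen = max(range(len(arms)), key=lambda i: arms[i].ucb(context))
arms[chosen].update(context, reward=1.0)
```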


We do zero-downtime deployments with a single Docker volume containing the db. Spin up a container running the new code, wait until it's healthy, then kill the old container.

