hruk's comments | Hacker News

bruh


This is just untrue - the naive implementation (make the API call, write a single row to the db) will work fine, as transactions are quite fast on modern hardware.
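A minimal sketch of that naive pattern using Python's built-in sqlite3 module; the table, column, and function names are made up purely for illustration:

```python
import sqlite3

# Hypothetical schema: one row per API result.
conn = sqlite3.connect("app.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
)

def record_api_result(payload: str) -> None:
    # The connection object acts as a context manager: it wraps the statement
    # in a transaction and commits on success (rolls back on error).
    with conn:
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))

record_api_result('{"status": "ok"}')
```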

What do you consider "serious" work? We've served a SaaS product from SQLite (roughly 300-500 queries per second at peak) for several years without much pain. Plus, it's not like PG and MySQL are pain-free, either - they all have their quirks.


Edit: disregard. I read it as he'd done it and had contention problems.

I mean, it's not fine if he's got lock contention from SQLITE_BUSY errors, now is it, as he implies. Many of his issues will stem from transactions blocking each other; maybe they're long-lived, maybe they're not. And are those 300-500 queries per second writes or reads? Because reads are not a problem.


Roughly 80/20 read to write. On the instance's gp3 EBS volume (which is pretty slow), we've pushed ~700 write transactions per second without much problem.


For small OLTP workloads the locking is not going to be a problem. But anything that holds the write lock for even a measurable fraction of a second will gum things up real fast. Transactions that need it for many seconds? You'll quickly be dead in the water.


This is basically Breiman's "two cultures" at play. Do you care about optimizing y-hat, or do you care about doing inference on some parameters in your model? Depends on the business case, typically.
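A toy illustration of the split, with invented data and off-the-shelf libraries (scikit-learn standing in for the prediction-focused culture, statsmodels for the inference-focused one):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Toy data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# "Algorithmic modeling" culture: all that matters is how good y-hat is.
pred_model = GradientBoostingRegressor().fit(X, y)
print("MSE:", mean_squared_error(y, pred_model.predict(X)))

# "Data modeling" culture: we care about the parameters themselves.
inf_model = sm.OLS(y, sm.add_constant(X)).fit()
print(inf_model.params)      # point estimates
print(inf_model.conf_int())  # confidence intervals for inference
```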


You can do fairly well here with ridge regression as a poor man's hierarchical model. We've used this library's Bayesian ridge regression to support a geo-pricing strategy (and it contains the Dirichlet-Multinomial approach as well): https://github.com/bayesianbandits/bayesianbandits
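For a sense of the general idea (this sketch uses scikit-learn's BayesianRidge rather than the library linked above, and the region data is invented): one-hot encode the group and let the shared ridge prior shrink the per-group coefficients toward each other, which is the partial-pooling effect you'd otherwise get from a hierarchical model.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.preprocessing import OneHotEncoder

# Hypothetical geo-pricing data: region and an observed outcome per sale.
regions = np.array([["east"], ["west"], ["east"], ["south"], ["west"], ["south"]])
outcome = np.array([1.2, 0.8, 1.1, 0.9, 0.7, 1.0])

# One-hot encode the region; the ridge prior shrinks per-region coefficients
# toward the shared intercept, much like partial pooling.
X = OneHotEncoder().fit_transform(regions).toarray()
model = BayesianRidge().fit(X, outcome)

# Posterior predictive mean and standard deviation for each region.
mean, std = model.predict(np.eye(X.shape[1]), return_std=True)
print(mean, std)
```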


All databases are painful in their own way. I've used all three at various times in my career, and I think SQLite behaves quite predictably, which has made it a lot easier for me personally to administrate.

If I had to start something new, I'd use SQLite until at least high 5 digit queries per second. Maybe more.


Agree on many things here, but SQLite does support WAL mode, which allows one writer and N readers with snapshot isolation on reads. Writes are serialized but still quite fast.
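For reference, enabling it is a one-line pragma; the busy_timeout value below is an arbitrary example, not a recommendation.

```python
import sqlite3

conn = sqlite3.connect("app.db")

# journal_mode=WAL is persistent: it's stored in the database file, so it only
# needs to be set once, but re-issuing it is harmless.
print(conn.execute("PRAGMA journal_mode=WAL").fetchone())  # ('wal',)

# Optional: have writers wait (here 5000 ms, arbitrary) for the single write
# lock instead of failing immediately with SQLITE_BUSY.
conn.execute("PRAGMA busy_timeout=5000")
```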

SQLite (actually SQL-ite, like a mineral) may be light, but so are many workloads these days. Even 1000 queries per second is quite doable with SQLite and modest hardware, and I've worked at billion-dollar businesses handling fewer queries than that.


We've used this Python package to do this: https://github.com/bayesianbandits/bayesianbandits


FWIW, I've been running a system with roughly 100K users, about 25 qps on average, with a single SQLite file for several years. No issues with data.


That's... pretty amazing. It sounds crazy to me (I'm obsessive about hourly backups), but do you use something like Litestream to keep copies?


From some months ago: https://news.ycombinator.com/item?id=43076785

> searchcode.com’s SQLite database is probably one of the largest in the world, at least for a public facing website. It’s actual size is 6.4 TB.


Yep, we use Litestream. It's been very reliable.


We've used this library for Bayesian contextual bandits in production (we have a critical business use case supported by a ~200K feature sparse Linear UCB bandit). It's a small community, but it's also a small enough codebase that we've read through all of it and feel fine about maintaining it ourselves in case it goes inactive.

https://github.com/bayesianbandits/bayesianbandits
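For anyone unfamiliar with Linear UCB, here's a stripped-down dense sketch of the core update rule; the feature dimension and alpha are arbitrary, and a real sparse, ~200K-feature deployment would use the library's implementation rather than raw numpy like this:

```python
import numpy as np

class LinUCBArm:
    """Minimal dense LinUCB arm: keeps A = I + sum(x x^T) and b = sum(r * x)."""

    def __init__(self, n_features: int, alpha: float = 1.0):
        self.alpha = alpha
        self.A = np.eye(n_features)    # regularized design matrix
        self.b = np.zeros(n_features)  # accumulated reward-weighted features

    def ucb(self, x: np.ndarray) -> float:
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        # Point estimate plus an exploration bonus from parameter uncertainty.
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x: np.ndarray, reward: float) -> None:
        self.A += np.outer(x, x)
        self.b += reward * x

# Usage: score each arm's UCB for the current context and pull the argmax.
arms = [LinUCBArm(n_features=4) for _ in range(3)]
context = np.random.rand(4)
chosen = max(range(len(arms)), key=lambda i: arms[i].ucb(context))
arms[chosen].update(context, reward=1.0)
```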


We do zero-downtime deployments with a single Docker volume containing the db. Spin up a container running the new code, wait until it's healthy, then kill the old container.

