Ask HN: Have you used SQLite as a primary database?
518 points by barryhennessy on April 25, 2022 | 325 comments
I periodically hear about projects that use/have used sqlite as their sole datastore. The theory seems to be that you can test out an idea with fewer dependencies (and cost) and that it scales surprisingly far.

There are even distributed versions being built for reliability in the cloud: dqlite by canonical (of Ubuntu fame) and rqlite

Given the complexity, it seems like there are use cases or needs here that I'm not seeing, and I'd be very interested to know more from those who've tried.

Have you tried this? Did it go well? Or blow up? Were there big surprises along the way?

- https://sqlite.org - https://dqlite.io - https://github.com/rqlite/rqlite




Here's an all-time great post about why you might consider SQLite in production with data about performance: https://blog.wesleyac.com/posts/consider-sqlite

I use SQLite in production for my SaaS[1]. It's really great — saves me money, required basically no setup/configuration/management, and has had no scaling issues whatsoever with a few million hits a month. SQLite is really blazing fast for typical SaaS workloads. And will be easy to scale by vertically scaling the vm it's hosted on.

Litestream was the final missing piece of the puzzle that helped me use it in production — continuous backups for SQLite like other database servers have: https://litestream.io/ With Litestream, I pay literally $0 to back up customer data and have confidence nothing will be lost. And it took like 5 minutes to set up.

I'm so on-board the SQLite train you guys.

[1] https://extensionpay.com — Lets developers take payments in their browser extensions.


This was so inspiring to read. It's a very balanced take about the pros and cons of using SQLite vs. Postgres at scale.

I say "inspiring" because using SQLite reminds me of the simplicity and productivity of coding for the "early web" that we lost 10-15 years ago. The days when you could spin up a website without worrying about a bunch of ancillary services and focus on the app itself.

For me, SQLite's lack of online schema changes seems like perhaps the biggest blocker to actual production. I've never had a production project where the schema didn't change a lot.


I have this beef too. Tooling for dumping and restoring into a new schema is easy/simple/fast, so these schema migrations can happen w/o issue. There are some tricks with the PRAGMA directive in SQLite so you can roll out changes (eg: code supports old/new schema while migrating).


Interesting, that's super cool to read.

   Tooling for dumping and restoring into a new schema is easy/simple/fast.
Any resources you can point to that expand on this? Is this standard SQLite tooling? I'm curious how it would perform with large-ish databases - a few hundred GB or perhaps several TB.

(This is one of those things where I can "just Google it" but I was wondering if perhaps there was a particularly useful article that points out potential gotchas, etc)


I'm just using shell scripts and the SQLite CLI. I dump/restore files to the same FS so I can mv into place when ready.

Check out PRAGMA user_version; I just modify it on schema change, so the app knows where stuff is.

https://www.sqlite.org/pragma.html
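
In case it helps, here is a rough sketch of the same user_version idea using Python's built-in sqlite3 module (not my actual shell scripts; file and table names are made up):

    import sqlite3

    # Each entry upgrades the schema by one version; index 0 takes you from 0 to 1.
    MIGRATIONS = [
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
        "ALTER TABLE customers ADD COLUMN email TEXT",
    ]

    def migrate(conn):
        # user_version is a free 32-bit integer stored in the database header
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        for new_version, ddl in enumerate(MIGRATIONS[version:], start=version + 1):
            with conn:  # one transaction per migration step
                conn.execute(ddl)
                conn.execute(f"PRAGMA user_version = {new_version}")

    conn = sqlite3.connect("app.db")
    migrate(conn)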


We also used SQLite and leveraged PRAGMA user_version to do the trick. But we decided to move to PostgreSQL, mainly due to the schema migration constraints.

Our experience is SQLite is OK for small to medium projects, but when the business logic/data model becomes more complex, SQLite is not sufficient.

More discussions are captured here https://news.ycombinator.com/item?id=31038614


> With Litestream, I pay literally $0 to back up customer data and have confidence nothing will be lost.

This is not a guarantee Litestream makes (nor can it, since replication is async).

You'll lose things to catastrophic failures, but chances are you'd be able to restore to a last known good checkpoint.


What is the point of using SQLite under a web service?

I thought people complained about how MySQL sucks and PostgreSQL rocks for being right, and that SQLite was nowhere near being right or performant. (Things seem to be getting better with strict column types these days.)

I've recently migrated a smallish service from MySQL to PostgreSQL and found it's quite a bit of work if you're not careful to write to the SQL standard, which means that if the service had gotten bigger, your chance of moving away from SQLite kind of walks away.

So, why not use a safer choice to begin with? Nothing is complicated about running MySQL/PostgreSQL unless you've sold yourself to AWS and have to care for the cost, and don't know how to run a DB instance yourself.


SQLite is far faster than Postgres or MySQL; however, the price you pay for this is having a single writer thread, and it's a library incorporated into your process, not a shared DB. It's faster because those other features of a server have a cost as well, particularly the cost of write arbitration.

SQLite is fine when all your load can be served by a single backend process on a single machine. The moment you need multiple backends to handle more load, or the moment you need high availability, you can't do it with SQLite. SQLite has very limited DDL operations, so you also can't evolve your schema over time without downtime. Now, for streaming backups - how do you come back from node failure? You're going to incur downtime downloading your DB.

I run many SQL backed production services, and my sweet spot has become a big honking Postgres instance in RDS in AWS, with WAL streaming based replicas for instant failover in case of outage, and read replicas, also via WAL. This system has been up for four years at this point, with no downtime.

I love SQLite, and use it regularly, just not for production web services.


> my sweet spot has become a big honking Postgres instance in RDS

Why do you prefer PostgreSQL RDS over Aurora RDS? Aurora seems better in every way but price[1]. (I know it also had some growing pains at launch.)

[1]: Amazon RDS for PostgreSQL is ideal when you have a small-to-medium intense workload. It works best when you have limited concurrent connections to your database. If you’re moving from commercial database engines such as Oracle or Microsoft SQL Server, Aurora PostgreSQL is a better choice because it provides matching performance with a lower price.

https://aws.amazon.com/blogs/database/is-amazon-rds-for-post...


Price is the main reason. Write performance is a second one.


The article I linked implies that Aurora offers higher write performance:

> If your database workload reaches the Amazon RDS max limit of 80,000 IOPS, and it requires additional IOPS, Aurora PostgreSQL is the preferred database solution. If database workload requires less than 80,000 IOPS, you can choose either Amazon RDS for PostgreSQL or Aurora PostgreSQL based on supported features.


SQLite has write ahead logging as well

https://sqlite.org/wal.html
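
Turning it on is a one-line pragma per database file; roughly, with Python's stdlib sqlite3 (file name is arbitrary):

    import sqlite3

    conn = sqlite3.connect("app.db")
    # journal_mode is persistent: the file stays in WAL mode for later connections too
    print(conn.execute("PRAGMA journal_mode=WAL").fetchone()[0])  # prints 'wal' on success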


It doesn't let you run an active replica from WAL.


> So, why not use a safer choice to begin with?

I know several people who build projects like that. It takes them months to get a working product, just to discover it doesn't interest people or doesn't work like they expected. If for every piece of tooling you go for the "safe" and most performant one, you gain bloat and complexity real quick.

People underestimate the performance of "simple" tech. In 99% of projects, by the time your bottleneck is your DB system, I can assure you that it'll be the least of your concerns.


Which part of running MySQL instead of SQLite is over-engineering?


- Setting up a MySQL server on both your dev machine and server, and making sure they're the same version (extra fun if they're on different OS versions)

- Setting up an out-of-repo config file on the server with your MySQL credentials

- Setting up a backup script for your server data

It's only about an hour of work total, but it's an hour of work that I hate doing.


Hum... MySQL is one of those labor-generating technologies that a Luddite would fall in love with. Those are best avoided. But for Postgres...

- You make sure the production version is equal to or newer than the development one, or you make sure not to use new features before they reach production, which is quite easy. There is no problem with different OSes (except for Windows itself not being very reliable, but I imagine you are not using Windows in production, as it's another one of those labor-generating techs).

- Trusting a local user is the same level of security you get with SQLite, no credentials required.

- And setting a backup script... Wait, you don't do that for SQLite? There's something missing here.

Yes, there are a lot of small tasks that add up when setting some new software. It's a pain. But it's a pain you suffer once, and it's over. It's worth optimizing, but not at any ongoing cost.


docker-compose is the way to go for keeping the dev versions synced with the production version. And for the backup script, a scheduled mysqldump and copy to storage should see you through quite far, so not really any more effort than copying an SQLite database.


This person is talking about reducing complexity, so I don't think adding more moving pieces to the machine is the way to go.


and now you need to learn docker and 10 other things.


So you trade some risk for an hour.


Please do explain what risk you're thinking of, as anyone smart enough to write their own SaaS would not put resources in the web server's file system tree. You stick your db file in a normal secure location outside the server's root, chmodded appropriately so that it suffers the exact same risks as any other file on the OS. It's no more or less risky than /etc/shadow, while being considerably easier to work with (and less failure-prone for light db work) than an independently running database service.


The risk is, as I had written previously, that it takes some effort to move from one db to another when the need arises, and I see no benefit in choosing SQLite in the beginning.

I'm not a professional db engineer, but one point is that there doesn't seem to be a way to create functions in SQLite, which would mean creating triggers on various tables can cause an excessive amount of duplicate code.

If I rely on PostgreSQL, I feel covered for my use case for web apps, but once you hit some little gotchas in SQLite, you may regret saving 10 minutes (install db and set up a password) for nothing.


If a project of mine ever gets enough traffic to force me away from SQLite, I expect I'll be rolling in dough and willing to put in the effort.


That's not a risk, that's just an inefficiency further down the line (migrating data from sqlite to a "real" database can indeed be quite a chore, but far less so if you formalized your schema and constraints beforehand, so that a migration mostly involves exporting to SQL, rewriting where needed, and then importing into whatever dbms you end up using later on in the lifetime of your project).

When we're talking about risks, think security exploits: how is sqlite3 more likely to get your data leaked, or flat out copied in its entirety, compared to using a mysql/postgres/etc.


Inefficiency or not, if you start down the path of SQLite, you need to invest a good amount of time refactoring into another DB if you feel like migrating away.

When I realized SQLite would store any type of data in any kind of column type, it was obvious SQLite is different from the others. They only added strict types about a year ago, and that scared me enough not to use it.

And how is SQLite any less secure? You can flat out copy its entirety using pg_dumpall. I'm not talking about security.


While that's an unfortunate discovery, I do need to point out that that's a problem of your own making, not SQLite's fault. SQLite is very clear on what you can expect, with dynamic typing being one of the main differences between it and "regular SQL" databases.

Yes, plenty of folks just go for SQLite blindly, but the consequences of that aren't SQLite's doing: if you take the time to read through https://www.sqlite.org/features.html because you want to know what it can actually do, you'll almost certainly click through to https://www.sqlite.org/omitted.html because you'll want to know what it doesn't do, and then you'll see the "See also the Quirks, Caveats, and Gotchas of SQLite." link and you're going to follow it and read through https://www.sqlite.org/quirks.html because those sound pretty important to know about before you commit to using something that is going to be your application/service data store for the foreseeable future.


your app has full access to the SQLite file. MySQL/PostgreSQL have users and permissions. Security is about layers, and SQLite is removing one layer of security. You can, for example, put DELETEs or access to certain tables on a separate user that your web app has no access to. With SQLite, if your app gets hacked then they can do anything with the whole DB they want to. In addition, with a separate DB process you get audit logs. If someone hacks your SQLite app they may have access for months before you realize it, if you ever do. Especially if they are doing something subtle like UPDATEs on specific tables/fields that may go unnoticed but provide the hacker some benefit. This is why you can't simply rely on the idea of using a backup. That's only going to help if the hacker totally trashes your DB.

With a separate DB you may have a hope of detecting when someone hacked your app. But without that firewall, the question becomes: how much of the data in my SQLite can now be trusted? If you don't know what backup is safe to restore, then you can't trust any of it.

Again, this is about layers. Not saying MySQL/Postgres will save you. But they can increase the odds.


If your server or API can be exploited, it doesn't matter whether there's an auth layer in between. Your SQL server runs as a service to connect to, your sqlite3 file is a file that you need access to. They're the same kind of layer: you need to break through the server's security to ever get to them directly, and if your app gets hacked such that the hackers gain file system access, then:

1. You're fucked. The end. It doesn't matter whether you were using mysql, postgres, or sqlite3, or S3, or Redis, or any other server your app was connecting to: they can just look at your environment vars.

That's not going to happen "because you're using Sqlite3", that's going to happen because you used some obscure server software, or worse, rolled your own.

People really do seem to put too much faith into "it has a username and password, it's more secure". It's not: if someone has access to your actual server, they have access to everything your server has access to. Sqlite3 is no more or less secure than a dbms daemon (or remote) in that sense.


With sqlite your server, api or application can be hacked. The most common and likely hack would be somewhere in your application. It really doesn't make sense to use sqlite here.

Set up a separate database server and use it for all of your projects. That one hour pays off for each and every project.


Pieter Levels has been using SQLite for nomadlist and I think it's been going well for him.


Some risk of what?


Running MySQL/Postgres over SQLite:

- needs to be provisioned and configured

- needs additional tooling and operational overhead

- comes with a _large_ performance overhead that is only won back if you have quite a significant load - especially writes, which means the vast majority of web projects are slower and require more resources than they should.

- it makes the whole system more complex by definition

It is a cost-benefit thing that tilts towards RDBMS as soon as you need to sustain very high transactional loads and want a managed, individually accessible server that you can query and interact with while it's running in production.

But if it is just "a website that needs durability" then you haven't yet shown how that tradeoff is worth it.


Just wanted to add another scenario where postgresql has been useful to me: functions. There are cases where you have expensive operation(s) that reference a lot of persistent data. Even without massive traffic these operations can be prohibitively expensive in the middleware. Leveraging database functions can be a massive performance improvement (100x+ for me), especially if your middleware is slow (e.g. rails).

I’ve used SQLite in production once and it worked great. But that was a very simple app. For more complex (but not always higher traffic) I’m leaning more and more on postgresql and less on my middleware, like moving business logic to the database when it makes sense.


If you need the expressiveness and power of Postgres then sure; it also has way better JSON support, for example, there is generally better tooling for it as well, and so on. But in this case, your database becomes its own _thing_ and has much more value outside of being just durability for an application. Like what Supabase is doing, for example. That's a very fundamental design decision IMO. I explored this and it is very attractive and robust, but serves different use-cases.


> What is the point of using SQLite under a web service? … people complained how MySQL sucks and PostgreSQL rocks for being right and SQLite was nowhere near being right or performant.

My understanding is that for read performance SQLite is pretty damn good, outperforming MySQL and Postgres in both single and concurrent tests. The key performance issue is the single global write lock. If your data access pattern is massively read biased then SQLite is a good choice performance wise, if you see a lot of write activity then it really isn't.

With regard to being correct, it offers proper ACID transactions and so on. Typing is a big issue for some but far from all. It is significantly more correct than mysql used to be back before InnoDB became the default table type in ~2010, and at least as correct as it is now (aside from the data types matter, which depends on which side of that debate you sit on).


> What is the point of using SQLite under a web service?

Take a look at the Consider SQLite post I linked. They address your performance questions too.

For me, SQLite was a nice way to simplify launching and running my SaaS business and has had no downsides.


Any tips you can share on how to deploy your saas app without downtime when using sqlite?


For my personal project I'm planning to use a VM with an SSD. Manually, I'll use caddy to switch over to a new running backend service with a readiness check.

As for scaling if I need it I can increase disk space for the app server or scale out horizontally/vertically. Don't need that yet so I'm waiting for more details in the future to decide how to handle that.


Which backend “owns” the sqlite database?

That’s what I haven’t seen mentioned anywhere: if the database is part of the application, how do you switch from one version to another without downtime?


The VM owns the database. Multiple backend service versions run on that same VM. Usually just one but 2 while migrating traffic to a new version.

I haven't figured out details regarding having 2 processes both writing and reading to the SQLite DB at once. It might just be fine. With a 1 minute request timeout I can just shut down the previous version after 2 minutes (should receive no new requests after 1 minute) and caddy will have sent requests to the new version for a while.

Not sure which types of errors I'll be seeing but the client may need to retry requests in some cases.

This is all just what I've planned but hopefully kinda makes sense.


I would expect the performance of SQLite for queries against an index to outperform MySQL and PostgreSQL in all cases - because SQLite eliminates the need for network overhead by essentially executing those queries directly as a C function call.

So no matter how optimized MySQL and PostgreSQL are, SQLite will run rings around them for basic SELECT queries.


Running against a unix socket doesn't seem to make that a selling point for SQLite.


There's still serialization and deserialization overhead there. I would expect SQLite to win in benchmarks against MySQL or PostgreSQL on a Unix socket for basic "select * from table where id = 5" calls, but I've not done the work to prove it myself.
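For what it's worth, here is roughly how you could time the SQLite side of that by itself with Python's stdlib (numbers depend entirely on hardware; this only measures the in-process call, with no socket involved):

    import sqlite3, time

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     ((i, f"row {i}") for i in range(100_000)))

    n = 100_000
    start = time.perf_counter()
    for i in range(n):
        conn.execute("SELECT * FROM t WHERE id = ?", (i,)).fetchone()
    elapsed = time.perf_counter() - start
    print(f"{elapsed / n * 1e6:.1f} microseconds per point lookup")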


> I thought people complained how MySQL sucks and PostgreSQL rocks for being right and SQLite was nowhere near being right or performant

While "people" complaining is basically meaningless, I don't know why they'd be doing that about SQLite. It's used in most phones, all copies of Windows 10+, and countless other places.


Chrome browser has loads of them in your profile directory.


The Consider SQLite post mentions that one of SQLite’s improvements in the past decade is “WAL mode (enabling concurrent reads and writes)”. Does this mean that the official advice to avoid SQLite for concurrent writes [1] is no longer a big concern?

[1]: https://www.sqlite.org/whentouse.html


I think the way it's worded in that SQLite documentation page is still accurate:

> SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.

If your application needs to support hundreds of concurrent writes a second you shouldn't use SQLite. But that's a pretty high bar!
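
In practice "writers queue up" mostly comes down to setting a busy timeout so a writer waits for the lock instead of erroring immediately; a small sketch with Python's stdlib (file and table names are made up):

    import sqlite3

    conn = sqlite3.connect("app.db", timeout=5.0)  # wait up to 5s for the write lock
    conn.execute("PRAGMA journal_mode=WAL")        # readers no longer block the writer
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)")
    with conn:  # concurrent writers queue behind this transaction instead of failing
        conn.execute("INSERT INTO events (payload) VALUES (?)", ("click",))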


Also it should be noted that SQLite is a library, so it charges against the quotas for the main process.

This may become an issue with large run-time datasets even in read-only shared access scenarios.


Take this with a huge grain of salt because I am by no means an expert, but I currently am working on some scripts that import a few million rows into SQLite. I am using bash and the sqlite command line. I was getting a lot of concurrent write errors from sqlite (bear in mind I am only doing inserts of separate rows, so in theory there is never an actual conflict), so I tried using WAL mode. It actually resulted in more contention. I ended up just going back to non-WAL mode and implementing an exponential backoff in bash to retry writes.
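
(For anyone curious, the retry loop amounts to something like this — shown in Python rather than my bash, with the table name made up; "database is locked" surfaces as OperationalError:)

    import random, sqlite3, time

    def insert_with_backoff(conn, row, max_attempts=8):
        delay = 0.05
        for attempt in range(max_attempts):
            try:
                with conn:
                    conn.execute("INSERT INTO rows (data) VALUES (?)", (row,))
                return
            except sqlite3.OperationalError as exc:
                if "locked" not in str(exc) or attempt == max_attempts - 1:
                    raise
                time.sleep(delay + random.uniform(0, delay))  # jittered exponential backoff
                delay *= 2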


Are you ingesting the data under one transaction? This is a common SQLite issue, as writes aren't slow but transactions are. By default 1 write = 1 txn, but you can put millions of writes into 1 txn and get many orders of magnitude speedup.
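
Roughly the difference, sketched in Python terms (table name is invented); the point is one commit — and one fsync — for the whole batch instead of one per row:

    import sqlite3

    conn = sqlite3.connect("import.db")
    conn.execute("CREATE TABLE IF NOT EXISTS rows (data TEXT)")
    rows = [(f"row {i}",) for i in range(1_000_000)]

    # Slow: one implicit transaction (and one fsync) per INSERT.
    # for row in rows:
    #     conn.execute("INSERT INTO rows (data) VALUES (?)", row)
    #     conn.commit()

    # Fast: every insert inside a single transaction.
    with conn:
        conn.executemany("INSERT INTO rows (data) VALUES (?)", rows)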


Yes, each row is written with one call to the sqlite command line. I was going to do something where I wrote inserts into a file and flushed them into sqlite in batches, but sqlite is not the bottleneck, so once retries were taken care of, it was good enough.


Thanks for this comment! I had tried some bulk imports into SQLite and mostly gave up after hitting similar limits, thinking I was doing something wrong with the configuration.


Litestream is indeed a missing piece of the puzzle. But it also defeats some of the purpose of using an embedded database library in the first place. Now you're back to juggling separate processes once again.


https://github.com/backtrace-labs/verneuil is in-process (precisely to minimise deployment pain).


I had never heard of verneuil. Thanks for sharing. For anyone curious about the differences between the two:

"This effort is incomparable with litestream: Verneuil is meant for asynchronous read replication, with streaming backups as a nice side effect. The replication approach is thus completely different. In particular, while litestream only works with SQLite databases in WAL mode, Verneuil only supports rollback journaling"


Good point! Although practically speaking I don't mind at all. "Juggling" is too strong a word — it's literally just starting the Litestream process and never thinking about it again. It's nice that it just slides into my existing app without any code changes.


If the application is in go, you can likely embed litestream.


> hits a month

Is not a very useful performance metric. What is your peak hits per second?


I love watching people use "x per month" as some sort of architecture selection argument, especially when these arguments conclude in a proud justification of cloud sprawl.

There are single node/CPU solutions that can process 10-100 million business events per second. I am almost certain that no one logged into HN right now has a realistic business case on their plate that would ever come close to exceeding that capacity.

E.g.: https://lmax-exchange.github.io/disruptor/disruptor.html

This stuff isn't that hard to do either. It's just different.


ESAIDBUSINESSTOOMANYTIMES


Then replace business with the problem you're trying to solve, the good you're trying to do for your users, etc. Keeping the big picture in view doesn't have to mean you're only interested in making money.


I'm not criticising making money. Just s/business//g and it reads better.


I was looking at SQLite for a product I’m working on. It looks awesome and has improved significantly since I last looked.

The reason I decided against it is that it doesn’t have proper stored procedures. I use them a lot in PGSQL. They result in far fewer lines of code.

They also have the benefit of significantly reducing round trip calls to the database, which is one of the key advantages of SQLite.

But having used stored procedures for years, I can no longer bear the thought of writing SQL code in a host language, so I’m going to stick with PG for the time being.

Would be great to see something similar in SQLite; there are other advantages such as the single file database, that would work well in a microservice environment.


So you want to write functions in SQL? With SQLite you can define your own functions which can be called in your queries, but they do need to be written in your application language.
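
For example, Python's stdlib exposes this as create_function; the body lives in the application, but queries can call it like a built-in (names here are just for illustration):

    import re, sqlite3

    def regexp_match(pattern, value):
        return 1 if re.search(pattern, value or "") else 0

    conn = sqlite3.connect(":memory:")
    conn.create_function("regexp_match", 2, regexp_match)  # name, arg count, callable
    conn.execute("CREATE TABLE users (email TEXT)")
    conn.execute("INSERT INTO users VALUES ('a@example.com'), ('bad-address')")
    print(conn.execute(
        "SELECT email FROM users WHERE regexp_match('@', email)").fetchall())
    # [('a@example.com',)]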


Yeah I know. But I find stored SQL to be a lot more concise than the application language, and less prone to bugs. I also find that it creates a much better separation of concerns between stuff that touches the database versus non-database stuff.

Just one example, error behaviour is well defined in PLPGSQL so I don’t have to constantly check for errors. That’s not true in my host language.

I am super impressed with SQLite, it’s just not a good fit for how I use databases yet.


What about zero downtime deploy of new version of your application? You have to take it down to restart, right?


"(5) ... SQLite allows multiple processes to have the database file open at once, and for multiple processes to read the database at once. When any process wants to write, it must lock the entire database file for the duration of its update. But that normally only takes a few milliseconds. ..." - https://www.sqlite.org/faq.html

You can start your new version of your application in a new process that opens the same database file, switch your load balancer to the new app, allow the old to drain all requests and then terminate the old app.


how do you handle things like encryption and access permissions?

The only thing I have against using SQLite in production (for my needs) is the lack of at rest encryption and row level permissions by user.


At rest encryption is a complicated subject (in general, I bet most people get negative net security from it). For SQLite, you can either encrypt your disk or get one of the versions with added encryption (I only know of proprietary ones).

You don't do row level permissions on your database. You keep it all on the application layer.


> You don't do row level permissions on your database. You keep it all on the application layer.

You say this as some sort of objective truth. Keeping security about data at the data layer can often be a really good idea and just as appropriate as GRANT/REVOKE at the table and column levels.


Well, if you want to use SQLite, you keep it at the application layer. That's an objective truth.

What is better, depends on what you are doing, and is not a simple choice in any way.


That's a tautology, isn't it? If the persistence engine you're using doesn't support a feature—no matter what that feature is—you'll have to fill in the shortfall somewhere else, such as the app layer.

I'm just saying if you have the option, if the engine does support the feature, and the solution is a good fit for your problem, you shouldn't shy away from it just because some other engine doesn't support it.


Hum, no. The word "tautology" doesn't mean that.

Anyway, yes, SQLite lacks a series of features you get on other DBMSs, often for good reasons. In exchange it brings an entirely different set of features. If you want Postgres, use Postgres.


One of my previous employers was using SQLite as a large distributed database - they had their own custom sharding strategy, but essentially the idea was to shard A accounts * B tables * C num_of_days with a .db file for every shard.

When I first came and saw it, it...did not sound right. But I didn't want to be the guy who comes in and says "you are doing it wrong" month 1. So I went along with it.

Of course, eventually problems started to pop up. I distinctly remember that the ingestion (happening via a lot of Kafka consumers) throughput was high enough that SQLite started to crumble and even saw WAL overruns, data loss etc. Fortunately, it wasn't "real" production yet.

I suggested we move to Postgres and was eventually able to convince everyone from engineers to leadership. We moved to a custom sharded Postgres (9.6 at the time). This was in 2016. I spoke to people at the place last month, and it's still humming along nicely.

This isn't to illustrate anything bad about SQLite, to be clear! I like it for what it does. Just to show at least 1 use case where it was a bad fit.

SQLite was a tempting first answer, but what solved it was Postgres, and we eventually offloaded a lot of aggregation tables to Clickhouse and turned the whole thing into a warehouse where the events got logged.


That's a good counter-case to keep in mind, thank you.

I guess the takeaway here is that this underscores that sqlite isn't for the 'large number of writers' scenario.

p.s.

> I didn't want to be the guy who comes in and says "you are doing it wrong" month 1

Very wise


I also experienced that one bad use case: heavy writes. I've always used SQLite first for any project, and when I did an in-house analytics tool for tracking user-initiated events (visits, button clicks, hovers, etc) I thought SQLite could handle it well. Unfortunately, even with heavy tuning, we saw WAL overruns and missing data.

Sadly, we had to move to Postgres and eat all that scaling complexity. :(


That's for sure a good point. SQLite has really well defined limitations, and most if not all of those are by design. You should consider them BEFORE starting a new project with it, but if it is a fit, it's a great experience.


Data loss is a pretty serious problem.

Do you have any more information about the situation?

Could it have been in the hand spun partitioning logic instead of SQLite?

What was the ingestion throughput, roughly?


Unfortunately I don't have numbers on hand. We approximated our Postgres would ingest around 1 TB over the course of a year, later I think? I could be wildly wrong.

It's been more than 5 years but from what I remember, it definitely was _not_ the partitioning logic (the sharding just meant we had a huge amount of files that were hard to organise). But a single consumer doing heavy writes on a single SQLite file would see enough traffic that pretty soon you would start to see errors and your writes would start to break.


The sqlite docs page has a nice article [1] on when to use an embedded database such as sqlite and when to go with a client/server model (postgres, mysql or others)

When not to use sqlite:

- Is the data separated from the application by a network?

- Many concurrent writers?

- Data size > 280 TB

For device-local storage with low writer concurrency and less than a terabyte of content, SQLite is almost always better.

[1] https://www.sqlite.org/whentouse.html


What is the recommendation for offline capability with sync?


We considered PouchDB[1] (client) and CouchDB[2] (server) for a PWA back then (2017). Nowadays I would probably favor WatermelonDB[3].

[1]: https://pouchdb.com/

[2]: https://couchdb.apache.org/

[3]: https://nozbe.github.io/WatermelonDB/


> For device-local storage with low writer concurrency and less than a terabyte of content, SQLite is almost always better.

Isn't MySQL MyISAM faster, and would it thus constitute a better choice for a scientific number-crunching application? I mean a near-4GB DB, very simple schema, heavy reading load, little/no inserts and no updates.


With 4 GB you might as well just load the data into RAM.


But then you have to implement all the SELECT and DML logic yourself. SQL makes this a breeze with JOIN, ON UPDATE CASCADE, etc. And being SQL, it is very easy to maintain, even by the PFY that replaces you.


SQLite (also H2 and some other embedded SQL databases) can be used entirely in-memory, one can also drop an SQLite file on a RAM-hosted filesystem (tmpfs/ramdrive). You really can put everything into RAM (and still enjoy SQL) if you have enough, don't mind long cold-load and potential data loss.
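
A tiny illustration with Python's stdlib: ":memory:" keeps the whole database in RAM, and the backup API can snapshot it to disk if the data-loss risk bothers you (file name is arbitrary):

    import sqlite3

    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE results (k TEXT PRIMARY KEY, v REAL)")
    mem.execute("INSERT INTO results VALUES ('pi', 3.14159)")

    # Optional: persist a point-in-time copy to disk.
    disk = sqlite3.connect("snapshot.db")
    mem.backup(disk)
    disk.close()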


It looked to me that the GP was suggesting to keep the data in the application instead of in a DB. But yeah, I suppose he might have meant a HEAP table instead of MyISAM.


If your RAM exceeds the size of your tables and indexes, that data will be served from RAM in any modern relational database system. No special config usually necessary for the speed but you don't lose everything when the power goes out, unlike tmpfs/ramdrive option.


That would depend on the DB server settings. Such a config might be found on a dedicated database server, but I doubt such settings would make sense on a machine running e.g. an application server together with the database.


I'd argue adding SQL into the mix makes it difficult to maintain, mixed-language codebases are almost by definition complex, and you get significant chafing when mixing a declarative language like SQL and OOP.

Since this is a no-update and no live-insert scenario we're talking about, it's fairly easy to produce code that is an order of magnitude faster than a DBMS, since they're not only primarily optimized for efficiently reading off disk (an in-memory hash table beats a B-tree every day of the week), they've got really unfortunate CPU cache characteristics, and additionally need to acquire read locks.


Maybe this is a failure of imagination on my part, but won't most people be using ORMs? Again, talking about the use case of the average application that's light enough to get away with SQLite, it doesn't seem like you would need to be hand writing queries.


In my experience ORMs add a layer of complexity, instead of removing one. It's nice to e.g. have a "Pythonic" interface in Python, but when working close to the data I far prefer to write a concise, clear query instead of trying to remember some ORM syntax or what they're calling VARCHARS in this particular ORM, or how they're representing JOINS, or if the condition will be on the ON clause or the WHERE clause, or how they're representing GROUP BY, etc etc.


Wrote code for many years sans ORMs.

Two features I enjoy in ActiveRecord and other ORMs, and why I would consider them a good standard practice for most things that aren't "toy" projects.

1. Easy chainability. In ActiveRecord you can have scopes like `User#older_than_65` and `User.lives_in_utah` and easily chain them: `User.older_than_65.lives_in_utah` which is occasionally very useful and way more sane than dynamically building up SQL queries "by hand."

2. Standardization. Maintenance and ongoing development (usually the biggest part of the software lifecycle) tend to get absolutely insane when you have N different coders doing things N different ways. I don't love everything ActiveRecord does, but it's generally quite sane and you can drop a new coder into a standard Rails project and they can understand it quickly. On a large team/codebase that can equate to many thousands or even millions of dollars worth of productivity.

    I far prefer to write a concise, clear query instead 
    of trying to remember some ORM syntax
100% agree.

ActiveRecord strikes a good balance here IMO. An explicit goal of ActiveRecord is to make it painless to use "raw" SQL when desired.

On non-toy projects, I think a policy of "use the ORM by default, and use raw SQL for the other N% of the time when it makes sense" is very very sane.


ORMs integrate poorly in many languages, and perform strictly worse than hand-written SQL.

If you're just using the database for object persistence, which is common, it doesn't matter all too much. But that's not really the scenario we're discussing here, since the data is, by the original problem statement, immutable.


With a database you generally are loading it into RAM, thanks to caching at the database and/or filesystem level, and you get all of the fun database features more or less for free.

There's a performance hit relative to skipping the database altogether and simply allocating 4GB of RAM and accessing it directly, of course.


DuckDB is the OLAP equivalent of SQLite, as far as I know.


> Isn't MySQL MyISAM faster

I think the performance MySQL has over sqlite comes from its multithreading more than the storage engine.

In my experience sqlite is just as fast as MyISAM for single threaded work.


MyISAM is not crash-safe. Anytime your server crashes, the MyISAM database may get corrupted.

Faster, but at what price?


Why?


I dunno, just felt like conventional (since long ago) knowledge that MyISAM is the fastest of all SQL DBs in simplistic non-RAM scenarios. I'm not sure this is true so I ask.


The engine might be faster (I'm not sure) but SQLite has the advantage that it doesn't have to connect over a socket. Instead you load the SQLite library into your code and your application directly manipulates the database files. That's potentially a lot faster.


Before sqlite, it definitely was said to be the fastest. I suspect the two are similar enough that it makes little or no difference these days, and sqlite (unless the dynamic typing thing is an issue for you, and even that is going away as recent versions support at least some stricter type enforcement) is safer and more “correct” than MyISAM in many ways.


It blew up big time. I would have saved myself lots of trouble if I had just gone with postgres from the getgo.

The workload was simple (single node work tracking) and I didn't expect it to become a bottleneck. Unfortunately, there were some default settings in the storage backend (tiny page size or WAL or something) that caused severe thrashing, and a dearth of tooling to track down the issue. After making a custom build with custom instrumentation and figuring out the problem, I found an email thread where the sqlite community was arguing about this exact issue and the default settings in question. A couple of people had foreseen the exact problem I had run into and suggested a fix. Their concerns were dismissed on the grounds that the problem could be configured away, and their concerns about discoverability of configuration were ignored completely. I wasn't thrilled with the crummy defaults, but seeing that the consequences had been foreseen, considered, and dismissed despite what seemed like widespread consensus on the fix being simple... it really damaged my trust. How many more landmines did SQLite have?

Lack of perf tooling + bad defaults = recipe for pain.


Not to diminish your issue but choosing good defaults for software with a wide range of use cases is hard. One person's blindingly obvious use case is another's niche.


Exactly. That's why neglecting tools in favor of "simplicity" is so dangerous. If it Just Works, great! But what if it Just Doesn't?


More details please if you have them. What was the throughput? number of transactions? Data size?

If you have that thread would be great to see it as well.


It was chugging at ~100 inserts per second on about ~300k rows of <100 bytes each. Just inserts, no index (it was the first thing to go), no foreign keys, one table only. SQLite was spending all the disk bandwidth copying data in and out of a tiny designated working area and flushing. This was in 2013 -- I'm afraid I've forgotten the details. It's an old problem, long fixed by now.

The problem itself didn't concern me nearly as much as "no perf tools + no perf foolproofing." That's a rough combo. If a problem this simple required this much debugging, extrapolations to problems of any complexity are terrifying. I knew that simplicity implied limitations, but this lesson taught me that simplicity could also imply danger.


This is a common problem. Bulk inserts should be done within a single txn if possible. The limit isn't insert throughput, but transaction throughput.

I've commented elsewhere here for docs referencing this problem. It's FAQ#19 on the SQLite website.

Were your inserts based on HTTP requests, or was it more of a batch process? Were they grouped in txns? Obviously user http requests would be harder to group up, but kudos if your service is handling 100 QPS of writes, as that's a pretty rare level of scale, approaching the real "need a beefy concurrent db server" type of problem statement.


The inserts were batched. That was the second thing I tried after getting rid of the index, and that brought it from 20/s to 100/s.

This wasn't a website, it was HPC job tracking. SQLite chugged so hard that the tail wagged the dog, though.


I'm not contradicting you, it is curious what you saw, but this SQLite FAQ does claim that 50k/s inserts on a normal computer HDD is possible. Maybe there was some other resource constraint or configuration somewhere. It is very curious.

> Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second. Transaction speed is limited by the rotational speed of your disk drive. A transaction normally requires two complete rotations of the disk platter, which on a 7200RPM disk drive limits you to about 60 transactions per second.


Could you tell us more about that configure option?


This happened in 2013. A few years later the fix went through. The bug itself is water under the bridge, what you're hearing is that I'm still leery of "no perf tools + no perf foolproofing." These judgements may be obsolete -- once burned, twice shy.


Simon Willison has written about using SQLite for a "Baked in data" architecture which is a super interesting method for some situations: https://simonwillison.net/2021/Jul/28/baked-data/

As he notes https://www.mozilla.org/ uses this pattern:

> They started using SQLite back in 2018 in a system they call Bedrock ... Their site content lives in a ~22MB SQLite database file, which is built and uploaded to S3 and then downloaded on a regular basis to each of their application servers.

I'm particularly interested in the "Sessions" extension (https://www.sqlite.org/sessionintro.html) and would love to hear if anyone has successfully used it for an eventually consistent architecture built on top of SQLite?


I designed something that used a local SQLite database on the client and a remote postgresql instance as the master. It used read and write queues at each end for sync and was eventually consistent.

Unfortunately it was far too advanced for the org and no one else understood it so it was canned in favour of a connected solution under the guise of ubiquitous internet access being available. This is proving to be a poor technical decision so my solution may have some legs yet.


Hey, I actually do the same in a mobile app. I dump everything into a local database; when they are connected to the internet it syncs, which lets it work offline.


I'm working on something similar. Mine is an RFID scanner kiosk that uses a local db when offline. How do you manage sync conflicts?


In my workflow, I PULL when they initially get online and MERGE non-conflicting changes and ACCEPT the changes with the latest timestamp if there is a conflict.

It's not perfect and people still complain, to do it cleanly I would need to prompt the user which change to take but I haven't figured out a clean way to do that yet.
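(The core of that last-write-wins rule is tiny — a toy Python sketch, not my actual code, with an invented updated_at field:)

    def merge(local_row, remote_row):
        # Rows are dicts carrying an 'updated_at' timestamp; keep the newer one.
        if local_row is None:
            return remote_row
        if remote_row is None:
            return local_row
        return remote_row if remote_row["updated_at"] >= local_row["updated_at"] else local_row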


What industry was this - if you don't mind me asking? Always interested in situations where internet may not be assured


If your database is just 22MB, probably even MS Access 2000 will perform adequately.


It is exceptionally great if you don't need parallel writes or have many terabytes of data - ie: for most services out there.

When embedding natively, like in a Rust app, the performance is better than any other RDBMS because there is no network/serialization overhead and you can use pointers in-process if needed.

The DevOps story also is a dream: typically it is just a single file (optionally + some more for journaling) and setup is automated away (most language libs bundle it already), plus it is widely known since smartphone SDKs and all webbrowsers include/expose it.

A subtle advantage: the supported SQL subset is so small that "if it works in sqlite, it will also work with $RDBMS" in most cases, but not the other way around. I always use it when getting started when in need of relational data, and only had to swap it out for postgres once, but not due to technical/scaling reasons (IT policy change & stuff).

Having said that, it is mind-boggling what kind of load you can handle with a small VPS that runs a Rust microservice that embeds it's own SQLite natively... that would be an expensive cluster of your typical rails/django servers and still have worse performance.


A slightly unusual use-case but for my work we have our own file format which is a thinly-veiled sqlite database. Originally we used a json file but we moved to sqlite for performance reasons once the files started getting to multi-gigabyte sizes.

It works great - there are ergonomic APIs in most languages, it’s fast and reliable, and great to be able to drop into an SQL shell occasionally to work out what’s going on. A custom binary format might be slightly more optimal in some ways but using sqlite saves so much work and means a solid base we can trust.


Considering Adobe Photoshop use(d) SQLite for its application file format, this could be very far from unusual.


When I maintained uptime.openacs.org (https://gitlab.com/siddfinch/uptime) and MyTurl (both running AOLserver) I wrote internal versions for a place I was working at.

I switched from Postgres to SQLite for a couple of versions, but mainly because Postgres wasn't "supported"; I called SQLite an "internal database thing".

Worked flawlessly for about 7-8 years before both services were gobbled up into micro API services.

At the last count, we have about 14,000 services checked by uptime (about 1,000 every 5 minutes, 2,000 every 10 minutes, the rest every 15). Probably had about 60,000 tinyurls in MyTurl. We also ran the MyTurl urls through uptime every night to look for bad links. The system got hammered, often.

It took minor tweaking to get the best performance out of the database, and AOLserver has some nice caching features, which helped to take the load off the database a bit. But overall, it worked as well as the Postgres counterpart.

And now, I have to figure out why I never released the SQLite version of both.


I'm running a bunch of different read-only sites and APIs on top of SQLite using Cloud Run and Vercel - abusing the fact that if the database is read-only you can package up a binary DB file as part of a Docker container or application bundle and run it on serverless hosting.

This means it won't cost any money if it's not receiving any traffic, and it can scale easily by launching additional instances.

I wrote about my pattern for doing this, which I call Baked Data, here: https://simonwillison.net/2021/Jul/28/baked-data/

A few examples are listed here: https://datasette.io/examples
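
One small detail that makes the pattern comfortable: the bundled file can be opened read-only through a URI, so nothing can accidentally write to the copy baked into the image. Roughly, with Python's stdlib (file and table names are placeholders):

    import sqlite3

    # mode=ro makes any write attempt fail instead of modifying the bundled file
    conn = sqlite3.connect("file:content.db?mode=ro", uri=True)
    rows = conn.execute("SELECT * FROM pages LIMIT 5").fetchall()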


Yes for all my sites: Nomad List, Remote OK, Hoodmaps, Rebase etc. No real issues at all.


I read in one of your Tweets that you use one database file per (unrelated) table to avoid corruption. Why did you move to this model? Are multiple tables per file really easier to corrupt?


Yes! It kinda happened because I had no idea how SQLite worked so I thought this was normal.

Then I thought this is great and makes it easier to move the db file if it just has 1 table. So I can download it easily for local dev for ex.

And yes in case there's corruption which never rly happens, only one file thus table would be affected.

PS: one thing that really helped reduce issues was setting PRAGMA journal_mode to WAL


I'd be interested to know what kind of corruption you were facing.


Here's a Twitter thread with some numbers from pieterhg's use of SQLite in production: https://twitter.com/levelsio/status/1308406118314635266


Awesome to hear!

How do you handle this? Do you store the SQLite file somewhere like s3 or just in memory?

How does this work for such high traffic sites?


No, in the filesystem on a VPS. All my sites just run on a VPS. Nothing fancy!


glad to see pieter here, I am keeping an eye on rebase ;)


I’ve worked on several projects with sqlite, both read and write heavy, all with high concurrency, with databases in the few hundred MB with 400k server clients, and 100 bare-metal servers running at capacity. The sqlite part of our system is never the problem. In our case sqlite has been an alternative to custom files on disk or replacing a spaghetti of hashmaps in memory. we also replaced a single postgresql instance with all customers into many sqlites per customer. Performance and reliability is why I always reach for it first. At this point I’m a zealot and would argue your first ‘data structure’ of choice should be an sqlite database :)


Did you ever run into the problem of the system running out of file descriptors? If each client gets its own database file, each connection will require a file descriptor.

For example, Linux limits the number of file descriptors to 1024 per process by default, so each process can open up to 1024 files at any point in time.

I ran into this problem in the past in another project (the system ran out of file descriptors for another reason).


What pain points have you experienced with "many sqlites per customer". I'm considering transitioning to something similar but would love to know what pitfalls I might not be considering.


None really, but we have a fairly simple design where all the shared databases we attach are read-only (think big static lookup tables that a separate process takes care of updating.) I would probably avoid having databases attach contextually -- seems complicated and error-prone.


I'm also in the SQLite is a data structure camp. You get fast access and persistence for free. You can also fix a lot of issues just exploring the database offline.


Don't be afraid of a database process. They are not scary, and are certainly less scary to scale up than whatever you might need to do with SQLite. There's more help available and better tooling.

SQLite may shine in edge cases where you know you can outperform a regular database server and you know why, and you could build everything either way. SQLite could be a way to e.g. decentralize state, using local instances to do local storage and compute before shipping off or coordinating elsewhere.

Otherwise, SQLite can simply be a recipe for lots of lock errors on concurrent operations. I've also never been very impressed with its performance as a general purpose replacement for postgres or MySQL.


> tooling

Yes, I learned this the hard way. I understood that simplicity meant limitations, but I did not understand that simplicity meant danger until SQLite burned me.

If your perf tanks, you don't want to have to spend days putting timers all around someone else's codebase. Caveat: SQLite may be better these days -- my incident happened in 2013 -- but I spent more time tracking that one SQLite issue than I have spent spinning up postgres instances since then.


do you have any details about the situation you were facing?


My solution path for databases has been like this for a good decade:

  1) Sqlite
  2) Self-hosted Postgres
  3) Big Boy Database, with an $$$ cost. (AWS Aurora, Oracle, etc).
Most projects never leave the Sqlite level. Only one has left the Postgres level so far.


I'm using SQLite for several personal projects as well; if you were to migrate to Postgres, how would you go about it? Any tools/services you recommend?


I think a lot of us fall into the trap of expecting that our apps will grow to a huge size, and that we need to be ready to scale just in case.

This is where the "MongoDB is webscale" meme came from.

The truth is SQLite and a single webserver or Docker container will be fine for 95% of web applications.

People really underestimate the advantage of simplicity vs perceived power.

Use SQLite.


One thing that really excites me is concurrent writes -- I was poking around the project, and I've seen drh has been working on this for a bit now. [1] [2]

I believe the high-level approach he's taking is essentially:

1. Concurrently execute the multiple write transactions in parallel.

2. Sequentially write the changed pages to the WAL. *[3]

If a previous transaction causes the next to compute differently (conflict), then rerun that next transaction & then write.

The way to detect conflicts is essentially:

1. Keep track of all the b-tree pages accessed before running the transaction.

2. Check the WAL to see if any previous transaction modified one of those b-trees. If so, this means we have to rerun our transaction.

I've seen it done in software transactional memory (STM) systems as well. It's really beautifully simple, but I think there are a lot of devils in the details.

[1] https://github.com/sqlite/sqlite/blob/9077e4652fd0691f45463e...

[2] https://github.com/sqlite/sqlite/compare/begin-concurrent

[3] * Write to the WAL, so that parallel transactions see a static snapshot of the world.
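
A toy illustration of that read-set check — nothing to do with SQLite's actual code, just the STM-style idea in miniature:

    def try_commit(read_pages, write_pages, committed_since_snapshot):
        """committed_since_snapshot: list of page sets written by transactions
        that committed after this transaction took its snapshot."""
        for pages in committed_since_snapshot:
            if read_pages & pages:
                return False  # conflict: caller must re-run the transaction
        committed_since_snapshot.append(set(write_pages))
        return True

    log = []
    print(try_commit({1, 2}, {2}, log))  # True: nothing committed since our snapshot
    print(try_commit({2, 5}, {5}, log))  # False: page 2 changed under us, re-run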


I wish more "self-hosted" open source projects would support sqlite out of the box. It's honestly a little ridiculous to, for example, stand up postgres for a 1-person blog on a personal domain. Or even a 10-person Mastodon or Pleroma instance or whatever.

That said, sqlite used 'badly' can be quite frustrating. Home Assistant, for example, is usually set up on an sd card in a raspi and then runs an sqlite database on it that it dumps massive amounts of very redundant data into as json blobs. Pretty common to have it just randomly lock up because the sd card has trouble with that frequency of writes.


I use it together with Rails and the horizontal sharding feature. Each customer has its own sqlite database running in WAL mode. Since the app is internally used, traffic/writes are pretty predictable.

I also do backups periodically with ActiveJob using `.backup` on the sqlite3 client. It's simple and nice because I just have to worry about running the app, and nothing else.
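
(If anyone wants the same thing outside of Rails, the online backup API is also exposed directly — e.g. in Python's stdlib it is roughly this; paths here are made up:)

    import sqlite3

    src = sqlite3.connect("tenant_42.db")
    dst = sqlite3.connect("tenant_42.backup.db")
    with dst:
        src.backup(dst, pages=1024)  # copies in chunks; safe while the app keeps writing
    src.close()
    dst.close()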


A potential downside here could be running out of file descriptors if you use one database file per client. Linux has a default limit of 1024 open files per process.

Did you ever run into this problem?


Would love to know more about how you set this up.


Are you doing multi-tenancy in your application by creating a different SQLite database for each customer? I'm curious to know more about your approach.


One other reason sqlite is great is the reduced latency. When you can satisfy queries in microseconds vs milliseconds, there is a fundamental shift in certain things you might try or not try.

We've been using this stuff in production for over half a decade now. Multi-user, heavily-concurrent systems too. The biggest cost savings so far has been the lack of having to screw with a separate database server per customer install (we do B2B software).


An interesting answer to your question from Tailscale a few weeks ago [1]: sqlite as the main database, many readers read a copy, the copies synced using litestream. [1] https://news.ycombinator.com/item?id=30883015&p=2


Yes, our clustered, real-time configuration file system uses sqlite as sole backing store.

https://pve.proxmox.com/pve-docs/chapter-pmxcfs.html

https://git.proxmox.com/?p=pve-cluster.git;a=tree;f=data/src...

When it was written by our CTO over 10 years ago he tried every DB solution available (that is, those that somewhat fit the picture); only sqlite survived every test thrown at it. If set up as documented it handles pulling the power plug in any situation, at least in our experience.

It may need to be noted that the DBs are only used locally, we synchronize commits ourselves via a distributed FSM, that's mostly transforming the Extended Virtual Synchrony corosync provides to simple Virtual Synchrony.


I've used in "production" and as the "primary datastore", but not in the ways those terms are normally used.

1. PHP web development for the client of a client. They needed persistent data and MySQL was not available. Moving to a different webhost was straight up rejected. Used sqlite with Idiorm and it worked just fine.

2. As the local datastore for a cross platform mobile application. The sqlite DB was unique on each device. Libraries were available and worked well.

3. This is a large one. Several 10's of thousands of installs that query the filesystem, but filesystem access is throttled by the vendor. We're using sqlite to store the state of the filesystem as it doesn't really change that much. If the db is damaged or whatever, it can be wiped as it isn't the final source of truth.


I no longer work there, but an enterprise facial recognition system used by NGOs and three-letter government agencies has SQLite as the sole datastore. I wrote a portion of the SQLite runtime logic, a simple key/value store used all over the software.

SQLite proved to be phenomenal. We spec'ed hardware with enough RAM to hold the FR DB in memory, and damn SQLite is fast enough to keep up with the optimized FR system performing 24M face compares per second. With a 700M face training set, SQLite also proved instrumental in reducing the training time significantly. These days, if given the opportunity to choose a DB, I always choose SQLite. I use SQLite for my personal projects, and I go out of my way to not use MySQL because SQLite is so much faster.


Impressive numbers, thanks for sharing.

Out of interest, were you running on bare metal/cloud? And what kind of CPU was behind those 24M face compares per second?


Running on bare metal, and those numbers come from a 3.4 GHz i9. The system is a fully integrated single executable, with embedded SQLite. Since I left the firm a year ago, new optimizations have the facial compares down to 40 nanoseconds per face.


I'm using it for a suite of commercial desktop products and it's working out really well. You'll need to figure out your multiple-reader/single-writer connection pools, and graceful shutdowns in your custom server, to avoid data file corruption. This is stuff you wouldn't normally do with a db server, but the discovery has made for some great learning and provided food for thought on all kinds of load balancing and distributed db designs. Also started using fossil, which I really love for my small team.

Sqlite is one of the greatest open source projects in history, with awesome docs, and really is a tribute to the art of programming. I'm happy and honored to use it for the appropriate use cases (which are a lot more than one would think).


BTW, the only corruption I've run into with sqlite is plug-pullers with transaction frames still open, killing a debug session, the usual shenanigans that are improper with basic NTFS.


At my current company we deploy sqlite as the primary and only database for our server. Our use case is a little less impressive than your usual webscale startups though.

Our product is a self-hosted IoT & hub unit solution, so we have no requirements to work with thousands of users doing who knows what. For our use case, sqlite is perfect. We don’t need to work with millions of rows, don’t need to stress the relatively low-power server units with another long lived network process, have no requirements of authentication since the user owns all the data, and can easily get insights into the database both during development and during troubleshooting at customer locations.

I’d sooner leave the project than move to anything else.


We've used it in cloud migrations of light SQL Server workflows which were previously run on shared servers.

We replaced SSMS + SQL Server with Python + SQLite run in AWS Lambda. The jobs fetch the database from S3, update with the latest deltas and write out the database and some CSV files to S3. The CSV files drive some Tableau dashboards through Athena.

The SQL usually needs a bit of a rework to make this work, but for the volumes of data we were looking at (we're talking less than a million rows, jobs run once per day) we've seen good performance at low cost. We used DuckDB for a couple of workloads which needed more complicated queries, it's stupid quick.


I use Sqlite for all my projects, but most of my projects are Windows applications. At prior employers I did work on web applications that used traditional server-based databases.

In my opinion the biggest thing separating Sqlite from a "full blown" database is actually Sqlite's lack of stored procedures. At all of the places where I worked with traditional databases, we used stored procedures to create an ersatz data access abstraction so that the database design could vary independently of the API presented to the application. With Sqlite I find myself (ab)using views as a poor man's stored procedure, but of course that only covers the read-only or "functional" (in the functional programming sense) portion of stored procedure code.
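As a rough illustration of that view-as-abstraction idea (table and column names invented, not from the parent's codebase): the application queries the view, and the underlying tables can later be reshaped as long as the view keeps exposing the same columns.

    import sqlite3

    con = sqlite3.connect("app.db")
    con.execute("CREATE TABLE IF NOT EXISTS orders "
                "(id INTEGER PRIMARY KEY, customer_id INTEGER, total_cents INTEGER)")
    # The view plays the role of a read-only "stored procedure" / API surface.
    con.execute("""
        CREATE VIEW IF NOT EXISTS customer_totals AS
        SELECT customer_id, count(*) AS order_count, sum(total_cents) AS total_cents
        FROM orders
        GROUP BY customer_id
    """)
    rows = con.execute("SELECT * FROM customer_totals").fetchall()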

Everything other commenters have said about data size or centralization also applies, but for me (again, just personal opinion) I'd actually draw the line at the point where you can or cannot get by without stored procedures. From an operational standpoint that would be: at what point is it imperative to be able to vary the details of the database design while maintaining an abstraction layer (stored procedures) that allows application code to be blissfully unaware anything changed underneath it?

Examples of when that would be needed would be if new users + applications start having competing needs, or if you need to revamp your table structure to improve performance or get around a limitation. If you're in a startup or small company, it would be the point at when you find yourselves hiring a real Database Administrator (DBA) rather than giving DBA duties to developers. Prior to that organizational scale you may be better off with the simplicity of Sqlite; after reaching that level of organizational complexity you might need a "real" (server-based) database.


Question for people using SQLite in prod: how do you cope if your app is running on a platform like Heroku or Cloud Run, rather than a proper server or VM? Have you found a solution for the fact that those environments, and their disks, are ephemeral?


Yeah this is an annoying problem. I just run an actual server rather than an "app" platform. I haven't found it that complicated and I pay less monthly which is nice.


Postgres

About 2 years ago I wrote my own blog engine so I could get up-to-speed with NodeJS: https://andrewrondeau.herokuapp.com/

I would have loved to do SQLite with some kind of "magic" backup, as SQLite is more than enough to handle a personal blog. (It certainly would make development easier!) However, at the time Heroku only offered Postgres.


I'd always opt to use their hosted database in those constraints. IMO SQLite only works well if you own the storage, so bare metal (or VMs) only.


You could use Render or a similar service that offers persistent storage. I imagine Heroku has a similar feature.

[0] https://render-web.onrender.com/docs/disks



I use it for https://chaitinschool.org/ but it's a fairly small web app and minimal traffic. It's nice to move the data around easily (which is mostly workshop/event data) but if we have more people sign up I might switch to Postgres.


Yes. For http://ht3.org which is a search engine I wrote for tech related articles. It works really well. It uses the fts5 extension, that allows full text searching. There are over a million indexed pages and it’s no trouble.


I'm also using fts5 for some small projects but I haven't looked too deeply into it, so I'm wondering if you have any interesting insights. Like what kind of index/options do you use? Maybe the trigram index? And your "across boundaries" mode is just word* in fts syntax?


Yes. Acute observation. It’s actually two separate indexes. A trigram index for sub word searching. Word* is exactly right and the other is a Unicode61 index with porter stemming (also remove diacritics). Trigram tends to work well with abbreviations and such. Whereas porter works well for general searches.
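For anyone wanting to try a similar two-index setup, here is a hedged sketch (table names invented; the trigram tokenizer needs SQLite 3.34+):

    import sqlite3

    con = sqlite3.connect("search.db")
    # Sub-word / abbreviation matching via trigrams.
    con.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages_tri USING fts5("
                "title, body, tokenize = 'trigram')")
    # General searches: unicode61 with Porter stemming, diacritics removed.
    con.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages_en USING fts5("
                "title, body, tokenize = 'porter unicode61 remove_diacritics 2')")
    # Prefix ("word*") queries work against either index:
    hits = con.execute("SELECT title FROM pages_en WHERE pages_en MATCH ?",
                       ("sqli*",)).fetchall()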


Interesting. I might play around using two indexes and try to combine both search results. This could be nice for my less technical users.


> The irritant-free web

This is a very respectable goal. I wish you great success!

Good full text search without pulling in another dependency would be quite a win. I'll add fts5 to my reading list. :)

Out of interest, what kind of compute and storage resources do you have underneath that?


Thank you for the appreciation :)

It’s a linode. Shared cpu Plan.

1 CPU Core 50 GB Storage 2 GB RAM

It’s just $10 per month.

With linode even shared cpus are powerful and I’m yet to hit any overload. I’m sure it will at some point, I might upgrade then.


Slightly off-topic, because we use PostgreSQL on the backend, but because StoryArk is offline-first, we heavily rely on SQLite in our mobile app. The backend database is mostly just there to backup your local data and to sync it across all of your devices. So there aren't that many queries being run on the backend database.

I can't really count how many times I've been pleasantly surprised by how extensive the feature set of SQLite is. I mean, it even has window functions (https://www.sqlite.org/windowfunctions.html). And being able to quickly open up the app's SQLite file in a database browser is also quite helpful during development.
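For example, a window-function query runs fine against the same file you ship in the app. A quick sketch (schema invented for illustration; window functions need SQLite 3.25+):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE plays (user_id INTEGER, story_id INTEGER, seconds INTEGER)")
    con.executemany("INSERT INTO plays VALUES (?, ?, ?)",
                    [(1, 10, 300), (1, 11, 120), (2, 10, 45)])
    # Rank each user's stories by listening time, without a GROUP BY dance.
    rows = con.execute("""
        SELECT user_id, story_id,
               rank() OVER (PARTITION BY user_id ORDER BY seconds DESC) AS r
        FROM plays
    """).fetchall()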


Yes, multiple times. It went/is going great!

Pros:

- A single API server, no separate database to worry about, configure, and update.

- Backups are as simple as backing up one file every so often. SQLite even has an API to do this from a live connection.

- Handles way more concurrent users than we’ve ever needed.

- Dev and test environments are trivial and fast.

- Plenty of tools for inspecting and analysing the data.

Cons:

- There are certainly use cases it won’t scale to, or at least not without a bunch of work, but in my experience those are less than 1% of projects. YMMV.

- The type system (even with the newish stricter option) has nothing on Postgres. I realise this is basically a non-goal but I’d seriously love to somehow combine the two and get PG’s typing in a fast, single file embedded DB library.

- Postgres JSON support is also better/nicer IMO.


I think SQLite vs. PostgreSQL is similar to Flask+SQLAlchemy vs. Django, or similar debates.

Yeah, you probably can do everything with the "simpler" stack. It might even be nominally faster in many cases. But if there's any chance you're going to end up rolling your own type-validation or ORM or admin interface or GIS... Just use the battle-tested kitchen sink from the get-go.


I have used SQLite for Django applications with a few thousand users. It has had no problems. However, I just use the ORM and never configure the SQL directly. The vast majority of LAMP stack style web applications would be an ideal use case.

However, I would consider how important these RDBMS features, which are not available in SQLite, are to you:

- less sophisticated type and constraint system.

- a severely limited ALTER TABLE.

- No stored procedures.

- limited selection of math and statistical functions.

- no permission and user model, not to mention row-level security.

To be clear, I don't think it's bad that SQLite doesn't try to be a full client/server RDBMS, but I would weigh this perspective when making a decision rather than performance, which is great and difficult to max out.


SQLite has a pretty powerful permission model via the set_authorizer hook - you can register a callback function which will be called each time SQLite attempts to read data. It's not widely used though from what I've seen.

It's also really easy to add new custom SQL functions to a SQLite connection, which means the missing math functions shouldn't be a limitation. Here's an extension for example: https://github.com/nalgeon/sqlite-stats
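Both hooks are exposed in common bindings. A minimal sketch in Python (the table/column names are invented):

    import math
    import sqlite3

    con = sqlite3.connect("app.db")

    # Authorizer: make one sensitive column unreadable (reads of it return NULL).
    def authorizer(action, arg1, arg2, db_name, trigger_name):
        if action == sqlite3.SQLITE_READ and arg1 == "users" and arg2 == "password_hash":
            return sqlite3.SQLITE_IGNORE
        return sqlite3.SQLITE_OK

    con.set_authorizer(authorizer)

    # Custom SQL function: plug a "missing" math function straight into SQL.
    con.create_function("log2", 1, lambda x: None if x is None else math.log2(x))
    con.execute("SELECT log2(1024)").fetchone()   # -> (10.0,)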


I would add "limited json support" to this list.

Under the hood, SQLite treats JSON as strings; you have to do some strange stuff with json_extract and computed-field indexing to index into it, which can be a bit fragile.
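Concretely, that workaround is usually an expression index over json_extract. A sketch with invented names, assuming the JSON1 functions are compiled in (as they are in most builds):

    import sqlite3

    con = sqlite3.connect("app.db")
    con.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, data TEXT)")
    # Index straight into the JSON text. The fragile part: queries must use the
    # exact same expression, or the index is silently ignored.
    con.execute("CREATE INDEX IF NOT EXISTS idx_events_user "
                "ON events (json_extract(data, '$.user_id'))")
    rows = con.execute("SELECT id FROM events "
                       "WHERE json_extract(data, '$.user_id') = ?", (42,)).fetchall()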

My use case these days is hybrid rdbms / nosql, where most of my tables have defined columns for frequently queried data, and a jsonb "data" field for everything else.

Postgres has impressive jsonb capabilities, and with 14 the index operators making querying it a dream.

I love SQLite, but postgres' jsonb handing makes the additional operational overhead worth it to me.


https://internetdb.shodan.io is powered by a SQLite database and gets millions of requests a month. It does require a different workflow and custom synchronization scripts but otherwise it's performed well.


What kind of workflow and synchronization scripts?

These sound like the kind of hidden costs that could turn sqlite's simplicity quite complicated if you don't see them coming.


+1, would love to know more about these!


I maintain an 'older' codebase (2012) and am rebuilding it to a new version, but both use SQLite. It's a configuration management web application installed onto either bare metal or virtual machines. Generally only a handful of simultaneous users; I want to say performance isn't much of an issue or concern, but I've had to fix a bug that was caused by too many writes to the database where the system ran into IOPS limits (traditional hard drives or constrained VMs at 100 IOPS).

There is a hacky solution for redundancy; at certain events, a copy of the .db file is made and rsynced to a secondary node. This will probably fall apart if the file ever goes above a few MB in size.

Pros / reasons to use it: Self-contained, just a single file to transfer, no drivers needed, no servers running other than my own application.

Cons: No good support for ALTER TABLE queries, so things like changing the name, datatype, or default value of a column aren't happening. The workaround is to create a new table and transfer the rows over, then drop the old table and rename the new one (sketched below). Also the aforementioned issue if you want redundancy.

So basically, if redundancy isn't a requirement for you, sqlite is fine. It's probably ideal for single user applications, like your browser or apps (iirc sqlite is used a lot for those purposes).
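For reference, the rebuild dance described above looks roughly like this (a sketch with made-up table and column names):

    import sqlite3

    con = sqlite3.connect("app.db")
    con.execute("PRAGMA foreign_keys = OFF")   # must be toggled outside a transaction
    with con:
        # 1. create a table with the desired new shape
        con.execute("CREATE TABLE hosts_new (id INTEGER PRIMARY KEY, "
                    "hostname TEXT NOT NULL, enabled INTEGER DEFAULT 1)")
        # 2. copy the rows over, mapping old columns/defaults to new ones
        con.execute("INSERT INTO hosts_new (id, hostname, enabled) "
                    "SELECT id, name, 1 FROM hosts")
        # 3. drop the old table and 4. rename the new one into place
        con.execute("DROP TABLE hosts")
        con.execute("ALTER TABLE hosts_new RENAME TO hosts")
    con.execute("PRAGMA foreign_keys = ON")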


I wrote a command line tool that automates that ALTER table flow (create new table, copy data across, rename and drop old table) - you might find it useful: https://simonwillison.net/2020/Sep/23/sqlite-advanced-alter-...


My preferred production DB is PostgreSQL. However, for small experiments, SQLite is more versatile due to fewer dependencies, single binary, zero install overhead etc., so I use it often, in particular for research experiments and systems prototyping. The only thing that ever bothered me was the lack of type enforcement, which has since been improved.

Production uses: 0 (1 if my Ph.D. thesis code is included, which had some C++ code that linked against version 2 of the SQLite library).


I'm coming from a similar direction. Postgres is my go-to, and I love its reliability when you get your schema right.

Glad to hear its type enforcement situation is improving.


For those who've missed the announcement, here are some past links on the topic:

    - https://www.sqlite.org/stricttables.html
    - https://news.ycombinator.com/item?id=28259104
    - https://news.ycombinator.com/item?id=29363054


a. I'm surprised no one has mentioned WAL2 + BEGIN TRANSACTION, both of which are in separate branches with the plan to be merged into main.

Even though SQLite can handle 99% of peoples use cases, WAL2 + BEGIN TRANSACTION will greatly close that last 1% gap.

b. Expensify has created a client/server database based on SQLite called https://bedrockdb.com and years ago it was scaling to 4M+ qps https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...


Do you mean 'BEGIN CONCURRENT'?

https://sqlite.org/cgi/src/doc/begin-concurrent-pnu-wal2/doc...

Where did you see that the plan is to bring those into the mainline distribution?


Whoops, yes - meant BEGIN CONCURRENT (I can't update my original post).


I wasn't aware of a plan to merge wal2 and BEGIN CONCURRENT into main. That would be awesome. What is your source about this?


Many of the replies here attest to the simplicity and fast performance of SQLite particularly for serving pages or data. But how well does SQLite fare in concurrent write/insert situations?

Although SQLite is not designed for this type of scenario, this discussion highlights that there's a strong demand for a concurrent client/server RDBMS that is simple, performant, and easy to deploy. PostgreSQL is powerful and feature-rich, but not simple or easy to deploy. Hence the appeal of SQLite.

For example, could SQLite power a discussion forum of moderate (or more) activity i.e. users posting comments? The Nim language forum is powered by SQLite, but activity in the forum is fairly low. [1]

Between the simplicity of SQLite and the complex heavyweight that is PostgreSQL, there is a wide gap. It's a shame there is no concurrent RDBMS to fill it.

(Note: Another poster mentions the concurrent Firebird RDBMS as a possible alternative, but I haven't used it. [2])

[1] https://forum.nim-lang.org/

[2] https://firebirdsql.org/en/features/


Most SQLite writes take less than a ms, so the concurrent writes scenario isn't actually a big problem - writes can be applied via a queue.

Does your forum accept more than 1,000 writes per second? If not, SQLite should be fine.
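A write queue can be as small as a single writer thread that owns the connection. A hedged sketch (schema invented):

    import queue, sqlite3, threading

    writes = queue.Queue()

    def writer():
        con = sqlite3.connect("forum.db")
        con.execute("PRAGMA journal_mode = WAL")
        while True:
            sql, params = writes.get()     # block until a write arrives
            with con:                      # one short transaction per write
                con.execute(sql, params)
            writes.task_done()

    threading.Thread(target=writer, daemon=True).start()

    # Request handlers enqueue writes instead of touching the DB directly:
    writes.put(("INSERT INTO comments (post_id, body) VALUES (?, ?)", (1, "hi")))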


> But how well does SQLite fare in concurrent write/insert situations?

It's pretty well known that concurrent writes are SQLite's weak point, and that if your application requires a high volume of writes, it's not the proper solution.

The SQLite devs even acknowledge this:

> SQLite will normally work fine as the database backend to a website. But if the website is write-intensive or is so busy that it requires multiple servers, then consider using an enterprise-class client/server database engine instead of SQLite.

> [...] client/server database systems, because they have a long-running server process at hand to coordinate access, can usually handle far more write concurrency than SQLite ever will.

(https://www.sqlite.org/whentouse.html)


Not simple or easy to deploy? On a server, it's "apt-get install postgresql" or "yum install postgresql". In the cloud, you choose RDS on AWS, Google Cloud SQL, Heroku Postgres, DigitalOcean managed DB, etc. Need to step it up? AWS Aurora or Aurora Serverless.

Just because Postgres has all the features and knobs doesn't mean you have to use them or turn them. You can always use it like SQLite but just one or two things more that you need.


Things worked well at the outset, especially in local development against my NVMe drive for my small CRUD application.

Then, with a little traffic, things continued to go well in production. But as traffic scaled up (to 1-5 QPS, roughly 25% writes), they fell apart. Hard. Because my production environment was spinning rust, IO contention was a real issue and totally absent from development. This manifested as frequent database timeouts, both from reads and writes.

Echoing another commenter's sentiment: things would have gone much more smoothly from the beginning had I started with PostgreSQL, but after having written many thousands of lines of direct SQL taking intimate advantage of SQLite's surprisingly rich featureset, migrating was less than totally appealing.

The mitigation strategy, which ultimately worked out, was to implement backpressure for writes to SQLite: queuing and serializing all writes to each database in the application, failing loudly and conspicuously in the case of errors (thus forcing the client to retry), and gracefully handling the rare deadlock by crashing the process completely with a watchdog timer.


So you were using an HDD, not an SSD?

Would an SSD in production have solved the timeouts by increasing your write throughput?


According to my testing, SSDs would greatly reduce the contention but were not a complete fix.


This article could help a little: https://www.unixsheikh.com/articles/sqlite-the-only-database...

But you probably won't see it since at the time of writing my response there are already 172 comments.


We are using Sqlite as the primary data store for "file blobs" and implement software-based replication using our queue to multiple servers for redundancy.

The use case is storing trace/profiling data, where we use one sqlite file for each customer per day. This way it's easy to implement retention-based cleanup and also there is little contention in write locking. We store about 1 terabyte of data over the course of 2 weeks this way.

Metadata is stored in Elasticsearch for querying the search results, and then displaying a trace hits the Sqlite database. As looking at traces is a somewhat rare occurrence, we iterate over all fileservers and query them for trace data given an ID until we find the result.

Reference https://www.sqlite.org/fasterthanfs.html


For one project we used sqlite in a GUI application to store some tabular data (usually way below 1GB), and to do some light processing over it. Mind that the performance requirements were minimal, as this was just an auxiliary function; the program did do a lot of heavy lifting in other parts, but the DB wasn't involved in these (it got fed some messages/stats at best, and most of the data came from other aux functions). The sqlite lib was easy to integrate on both the technical and the legal level. We could have done all that with native code, too (I think we even removed some code?), but it would have consumed much more time (dev work, unit tests, maintenance) without any benefits. And it worked like a charm, except for one issue: The GUI person did create a connection for every operation but thought they were reusing it, which then caused some weirdness. And this was easily fixed.

An important realization is that not everything needs to scale, and that it depends on how you access the DB and what your product looks like. For a load with many concurrent writes I'd be careful with sqlite, or when I know that I'll want my DB to mostly live in memory (e.g. operations will often process the whole, huge dataset and no index can help with that). But even if I thought "Uh, I'll probably need a full DB", I'd still benchmark my application with both sqlite and e.g. postgres. And if the API to access the DB uses some nice abstractions, swapping the flavor of SQL isn't a huge issue anyway.

//edit: Plus, I've done stupid stuff like "my SPA hammers the PHP API with 20 to 40 requests, each resulting in a simple SQLite query, just to render a checklist" and got away with it: a) because we had at most 20 concurrent users [realistically: 1 to 5] b) doing the checklist took half a workday (ticking off an item was done via a JS callback in the background, so the actual rendering happened only once) and c) SQLite performs great for read-heavy loads. The site performed so well (page loads felt about as fast as HN, even when connected via VPN) that I even scrapped the plan to locally cache the checklist in HTML5 localStorage (bonus: no cache = no cache incoherence to care about).


Yes! Sqlite works great in production for single-threaded access. It can handle nearly the same sort of performance as a MySQL install: inserts are fast, full-text search is fast, it all just works great.

If you have multiple threads accessing the same database it will kill the speed of sqlite completely. It will work for development, but as soon as you put it into production and put any sort of threaded load on the database it will quickly become the bottleneck and bring the whole thing down. If you run into this threaded issue you can just switch to mysql at that point and it will fix the issue.


This isn't a typical use case. FWIW, a decade ago, we used sqlite as the persistence mechanism for an in-memory KV store called membase. (See https://github.com/membase/ep-engine). This was powering 50M+ DAU traffic in production for very intense write-heavy traffic. It did its job well. Around that time we also considered leveldb (and rocksdb a bit later) as alternative to sqlite.


I'm using SQLite for a small personal project that's live in production and so far I love it (both for its simplicity in development and for its performance).

But I've run into an issue on prod that didn't exist in dev on my MacBook M1, and I'm curious if anyone has any suggestions:

My app is basically quiet and serves requests in the dozens (super easy to run on a tiny instance), but for a few hours a day it needs to run several million database transactions, N+1 queries, etc. Because of the high number of IOPS needed, a small instance falls down and runs suuuuuper sluggishly, so I've found myself needing to shut everything down, resize the instance to something with more CPUs, memory, and IOPS ($$$), doing the big batch, then scaling down again. That whole dance is a pain.

Were I using a more traditional postgres setup, I'd probably architect this differently -- the day-to-day stuff I'd run on Cloud Run and I'd spin up a separate beefy instance just for the daily batch job rather than resizing one instance up and down over and over again. The constraint here is that I have a 50GB+ sqlite db file that basically lives on local SSD.

Any thoughts?


Interesting problem. Did you try grouping up transactions? E.g. instead of a few million txns, do a few thousand 1,000-op chunk txns. SQLite is much, much faster within a txn.

Edit: did the math on several million txns over a few hours. How many per second? ~500.

According to here (question 19), for old HDDs you could expect a 3-orders-of-magnitude improvement by using bigger txns. Not sure about SSDs, but worth a shot.

https://www.sqlite.org/faq.html
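What that batching might look like in practice (a sketch with an invented table, not the parent's code): commit once per chunk of rows instead of once per row.

    import sqlite3

    def insert_in_chunks(con, rows, chunk=1000):
        """One transaction per `chunk` rows rather than one per row."""
        for i in range(0, len(rows), chunk):
            with con:   # BEGIN ... COMMIT around the whole batch
                con.executemany("INSERT INTO prices (symbol, ts, px) VALUES (?, ?, ?)",
                                rows[i:i + chunk])

    con = sqlite3.connect("batch.db")
    con.execute("PRAGMA journal_mode = WAL")
    con.execute("CREATE TABLE IF NOT EXISTS prices (symbol TEXT, ts INTEGER, px REAL)")
    insert_in_chunks(con, [("AAPL", 1650000000 + i, 150.0 + i) for i in range(100_000)])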


I have not tried that yet (I was being lazy since it all ran so fast on my M1), but that's a good idea for something to investigate. That way I suppose I can run the whole thing on a less-beefy instance and avoid the scale-up/scale-down cycle.


Yeah I think if you are doing 1 txn per second roughly instead of 100s, it may just work fine. I'm curious how it goes so shoot me an email if you want with updates


Maybe use a disk that's "independent" on the VM? That way you can spin up a temporary beefy VM, attach the disk and get instant access to your DB.

Never tried, but I believe that's how these GCP disk work: https://cloud.google.com/compute/docs/disks#pdspecs


How do you deal with the downtime? I also find myself constantly resizing instances (although for data analysis, where I sometimes need 80+ cores only for a few hours), and the constant restarting and reloading drives me crazy.

Maybe for sqlite because it's stored in ssd, it's only 1-2 minutes to shutdown+change parameter+restart if it's scripted?


So far, I just deal with the annoyance. It is indeed only a <5 minute operation, but it's just a pain (shut services down, shut instance down, resize, start instance, start services).

The problem is it's the exact kind of pain that I avoid, so many times I just keep the larger (8-core) instance up and just eat the cost. I should probably script this to solve this pain point, but I'd SO rather have a setup that doesn't require this sort of jiggering.


Maybe `BEGIN CONCURRENT` [1] could help in your case? :thinking:

[1] https://www.sqlite.org/cgi/src/doc/begin-concurrent/doc/begi...


Interesting - I'll explore! Thanks.


VATcomply.com has been using SQLite as a primary database for 2 years and is serving ~70 requests per second without an issue.

Site: https://vatcomply.com/

GitHub: https://github.com/madisvain/vatcomply


I used SQLite and some custom Python for a client solution, a web application for teaching and thought processes. It was a large application (many steps and user screens, some admin functions), probably a good fit for Django but I don't like Django. I used Python sqlite interfaces, running on a single Debian base server. The server ran like a tank, no problems whatsoever, but the application had slow performance at times. The client did a complete re-write later with different consultants and they started over without SQLite. In summary, the clients had no clue what SQLite is or why you would want it, and my efforts to explain the benefits in detail did not sink in, while the slow performance was very important and caused them to dislike the entire product. All things considered, I would re-write it the same way, since I enjoyed the tech stack personally, but debugging those rough spots instead of adding a dozen more GUI features would have been better for the project.


I had a great use case for SQLite last year.

Client (incident response dept at megacorp) had a problem: their quarterly exercises of switching network storage devices from live servers to disaster recovery (DR) servers was a manual operation of reconciling about 8 Excel spreadsheets and setting up ACLs before (luckily) an automated process would switch the storage mounts from live to DR.

We modeled and matched up all the hosts, servers, and ACLs and did a daily write to a single SQLite database. (We redundantly sent all the data to Splunk.) Now the DR employees are automating a daily diff of servers, hosts, ACLs etc to further automate the switch.

To echo a bunch of comments here, we decided on SQLite for a few reasons:

- only one user would write to the DB

- only a few users need to access the data

- besides standard retention policies, the data could be considered ephemeral and easily recompiled

- the script we wrote to compile the data runs in 5 minutes, so if we lose the db, we can easily recompile it.

SQLite (and SQLAlchemy) is useful for inexpensive data.


I'm a bit late to the party, but just in case it's useful to anyone: I'm using one DB per user for my CV tool on withoutdistractions.com

Since every user has a separate DB file, writes to those files don't block reads from the global DB file which contains everyone's public data. As long as you keep your user DB schema the same as the global DB schema, it's pretty easy to sync records using a simple cron job.

More info on my tech stack here: https://withoutdistractions.com/cv/faq#technology

In case that link ever goes down, here's an archive link: https://web.archive.org/web/20220503102946/https://withoutdi...


I tried doing this a bunch of times and most of them ended up requiring a migration, most often to postgresql. I can only remember one such case that still uses sqlite for a web service.

It always goes like this:

1. I start a new "lean" web service and decide to use sqlite.

2. Some months down the road I figure I need some slightly more advanced db feature. The ones I can remember are postgresql numeric arrays (for performance where I can test for membership in a where clause) and jsonb (again with its special syntax for querying and its performance implications).

3. For some time I postpone the inevitable and do various hacks until I fully hate myself.

4. Suddenly realize that migration to postgresql will reduce the complexity, even with regards of infrastructure, as I usually have redis et al. in the game (which I wouldn't have to use had I started with postgresql initially).

5. I waste several days migrating and wondering whether it (my initial stupidity) was worth it...

My advice is: if it's going to be accessed via the network (and you'll have to operate a server either way), make it two servers and go with postgresql. If you are not 100% sure about the opposite (no chance of it becoming a web service), go with postgresql. Is it a desktop app? Postgresql (just slightly joking here). Mobile app? OK, I guess you have no real choice here, go with sqlite.

And no, you can't "just use an ORM", because when the day comes, you will need to migrate because of features sqlite does not support and you will have made mistakes. If you used an ORM, now you'll have to migrate off both sqlite and the ORM.

PS: Ah, yeah, and now I remember one other instance where I had to migrate off sqlite solely because I needed to provide an admin interface (think PGAdmin) to the production system.


I can't comment on my own use of SQLite as a primary database for anything (although the existence of SpatiaLite [1] may lead to me trying this out), but whoever needs an embedded database system should probably consider evaluating Firebird for that role as well -- it has an embedded mode with basically no feature compromises relative to the server mode. (They even put Interbase -- Firebird's ancestor -- in (not only) M1 Abrams' tactical data system apparently [2], for reasons of reliability.)

[1] https://www.gaia-gis.it/fossil/libspatialite/index

[2] http://web.archive.org/web/20190224100905/https://core.ac.uk...


Yes, works great for my sites (they are mostly read-heavy). I used to default to Postgresql, now I default to sqlite.

This [0] is a good article with some benchmarks, misconceptions about speed, and limitations.

[0]: https://blog.wesleyac.com/posts/consider-sqlite


> SQLite has essentially no support for live migrations, so you need to instead make a new table, copy the data from the old table into the new one, and switch over.

That seems like a pretty big flaw as your data grows. Zero downtime migrations are really nice. Anyone here got a war story / experience with this one?


I've used SQLite with HashBackup for 13 years and it's been awesome. A lot of the stories I read about "data corruption" with SQLite are, IMO, really about faulty applications that do database commits with the data in an inconsistent state from the application point of view. I've done it myself - it's an easy sin.

I've migrated database versions 35 times over the last 13 years, ie, every individual HB database has been migrated 35 times. You don't always need to make a new table, do a copy, and switch over. In the latest migration, I added new columns, initialized them, dropped columns, etc. without doing a db copy.

For this migration I wanted to switch to strict tables, where typing is strict. I could have done this by just altering the schema (it's just a bunch of text in the SQLite db) and then using SQL to make sure existing columns had the right data (using CAST). But instead, I created a general framework to allow me to migrate data from one schema to another, mainly so I could reorder columns if I wanted. That can't be done with ALTER statements, so I did end up doing a complete copy, but I've done many migrations without a copy.

I found this paper interesting on "zero downtime migrations".

https://postgres.ai/blog/20210923-zero-downtime-postgres-sch...

After reading it, the bottom line is that changes happen within transactions (true for SQLite too), and the key to zero downtime migrations is to use short transactions, use timeouts, and use retries on all database operations, including the migration commands. You can do all these with SQLite.


I am using it as the primary application file format for my desktop application.

With a lot of help from https://www.sqlite.org/appfileformat.html

It has had some pains, but it has been great compared to flat-file storage.


Yes, I've used it for a side project of mine. It processed like 5 financial transactions in total, so I'm glad I never invested the time to build anything more robust :)

It's also powering another one and I really like the fact that I can just commit the whole DB to the Git repo.


Very interesting, what's your workflow? And how big is the DB?

I've never (deliberately) considered committing a DB to git. Although there was that one time when I was straight out of college...

Pro tip: surprising your colleagues in the morning with a 40 minute wait to pull master (because you committed a ???GB db) is a good way to feel like a right eegit.


It's very small, just ~2000 rows; it contains YouTube video IDs and some metadata.

For backup, I have a small bash script that creates a git commit and pushes it to GitHub.


I'm using SQLite in Notion Backups (most of the workload happens in background jobs; the web app itself doesn't get that many visits)

Except for some rare exceptions, it's been doing pretty great. I don't have any plans to migrate from SQLite any time soon.


Any chance I could convince you to share some of those 'rare exceptions'? I love a good exception ;)

Also they're where the real insights are.


Lately, Rails has been complaining about connection timeouts, stating that all connections in the pool were in use (this usually happens when Sidekiq, a background jobs framework, processes multiple long-running jobs).

By default, the connection pool in Rails contains 5 connections, and they time out within 5 seconds.


I am using sqlite where a simple persistence layer is needed, both as the sole in the project or along with a full-fledged database. There are many such projects, once you realize that a database is just an abstraction; for example, for caching in a larger project, or to store results for a subsection of the project. But of course also for smaller, standalone projects.

Also, take a look at ws4sqlite (https://germ.gitbook.io/ws4sqlite/) for a middle ground between SQLite (embedded) and rqlite/dqlite: it's "normal" sqlite addressable via web services. May be useful in some scenarios.


I run a wiki hosting service at https://editthis.info and each wiki that is hosted uses SQLite. This is nice because it is easy to back up the dbs, everyone has their own separate db, and with over 250k wikis, most of them defunct, they don't really have any overhead other than disk space. They aren't part of one large database that you need to filter all queries through to get the data you want. Also, it makes it so if I want to shard the system, I can just take half of the dbs and move them to another server, and it is trivial to do from a sysadmin point of view.


It's something I've been meaning to try for a long time. I had https://litestream.io/ in mind as a means to achieve a "Fully-replicated database".


Has anyone ever used multiple SQLite databases per tenant/account?

For sake of argument, let's say I have a fixed schema/format that will never change and I never need to aggregate queries across multiple customer accounts. Also, let's say writes to a single database are never going to be more than a hundred concurrent users. Why shouldn't I store each tenant's data in its own SQLite database? It makes it very easy for local client apps to download their data all at once. Backups are incredibly easy. Peer-to-peer synchronization behaves like a git repository merge. Why shouldn't I do this?


> a fixed schema/format that will never change

When would this happen for any non-trivial multi-tenant service? The difficulty of performing migrations sounds like it would pretty quickly negate any simplicity gained.


Definitely use case specific, as it wouldn't work generally, especially for domain driven data models. The schema I'm thinking of specifically represents a data model that is flexible enough where migrations are extremely rare[1].

Even if there are migrations, it's treated similar to a file conversion (like upgrading an older excel file format to a new format on demand when the file is accessed).

[1] Maybe something similar to https://www.notion.so/blog/data-model-behind-notion or an EAV model imbedded in a JSON1 column.


If the users are never aware of other users and they only care about their own data, sure. Maybe you have a service where a user can sign up to do a specific task that never requires interaction with another user.

However, if you want interaction across users (messaging, user-roles etc.) then you might want to have them in one database.


Yes for sniprss.com. Backend is written in Go and using litestream for replication.

No issues at all.


How do you deploy your application? Is it possible to do without downtime?


I used it for a Magic: the Gathering Commander tool I made recently[0]. It was pretty useful, since the point of the tool was on-device card searching and deck organization. I was even able to sync the data between multiple computers just by putting it in a NextCloud folder.

Aside from some surprises regarding packaging it together with the rust crate and inability to rename columns, I'm really happy with it. Easier than deploying postgresql, more useful than documents.

[0] https://github.com/Endominus/Lieutenant


The development of SQLite as it matures into more and more production ready DB reminds me a lot of the story behind the GE J85 engine: https://en.wikipedia.org/wiki/General_Electric_J85

It started out life as a disposable engine for a decoy missile and so the engineers took a very lightweight approach to it and kept the costs down. It would later be adapted for more permanent aircraft and ended up being one of GE's most successful and longest-serving engines.


I know a lot of people hate ORMs but I tend to find them useful, and if you do go down that route, it is pretty easy to write an application that will work with either a Postgres or SQlite backend.

I use SQLAlchemy and write applications where by just swapping out the database URI I can use either SQLite or Postgres. SQLite is nice for local development and easy testing, (you can even run tests using :memory: to accelerate CI/CD) and then I use hosted Postgres in prod. That said, based on what I have seen I would not be at all afraid to use SQLite in prod for internal tools etc.
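Roughly what that swap looks like (a sketch; DATABASE_URL is an assumed environment variable pointing at Postgres in production):

    import os
    from sqlalchemy import create_engine, text

    # SQLite (even :memory:) for dev/CI, Postgres in production; application code
    # only ever sees the engine, so nothing else changes.
    url = os.environ.get("DATABASE_URL", "sqlite:///:memory:")
    engine = create_engine(url)

    with engine.begin() as conn:
        conn.execute(text("SELECT 1"))   # same code path regardless of backend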


Tenable.SC ( https://www.tenable.com/products/tenable-sc ) uses SQLite as its only primary db.


Yes. For an interesting Desktop project in Java which needed to be fast. Using SQLite with the FTS extension, custom Java extension functions and also some BLOB tables with protobuf worked wonders.

I also built a Go library wrapping the SQLite amalgamation file and then cross-compiled it for Android and iOS, but with some more SQLite extensions (GIS) which the stock Android/iOS SQLite did not have. This was some time in 2017 I guess.

I am a big fan of SQLite. You can integrate it in all kinds of stuff and adapt it to your needs. Compiling it is also straightforward.


Yes, I use it for builtwithdjango.com . I have not a single bad thing to say about it, not one!

It is so convenient to have it as a file, especially when you are just learning to do software development.

And the performance has not been an issue once.

One thing to note is that my site is not Facebook size. It only gets ~40 page views a day. And most of them are just for viewing, so no database operations.

So, I'm not going to be the most credible voice here. FWIW, I know that Pieter Levels, who runs multiple projects like nomadlist, remoteok, rebase, uses both SQLite and plain JSON files for storage.


When I was Syncplicity's desktop client lead, we used SQLite. (Syncplicity is a desktop file synchronization application.)

When I looked around, Dropbox used it too; and so did Bittorrent Sync (Now Resilio)


Used it in several Django projects, up to around 2M pageviews a day but most of that cached read.

I usually drop it because I need something that Postgres has or does better, or it's a write heavy site.


SQLite is a horrible database in comparison to Postgres, MSSQL, Mysql and so on. There is no reason to use it as a primary db. There are many other areas were SQLite really shines though.


I am using it in production but my product has (almost) no users. Hopefully it will grow. I plan to replace it with Postgres as needed, but it offers some interesting new ways to approach certain problems, and creates interesting new problems like how to scale across multiple app servers? It's so far just a file. I'm sure there is some networking solution. Developing locally is interesting since I can just copy the production database file right to my machine.


FWIW, I've never once started with sqlite and then later "upgraded" to another db. I once actually used sqlite to implement locking for a mysql db because mysql's locking requires (or required, back then) that the calling code specify, in advance, every table which would need locking, and that wasn't possible in that code base. So an sqlite db connection was opened just to act as a mutex for the mysql db in some code paths.


I use sqlite and litestream for all of my projects, deployment very simple, performance is great, litestream backups to backblaze b2 cost pennies per month


We adopted SQLite for a commercial Windows application suite. The architecture allowed users to export, back up and share databases (useful in the context). I've also used it for a large number of developer desktop utility apps. I'd consider it for web apps which have limited data throughput; I'm not convinced it's a good candidate for high data volumes (but open to changing my mind).


With C# (Entity Framework) I've tried and failed using sqlite in a production scenario because of concurrency issues...

Recently I stumbled over a potential fix[1], which I will try in my next project.

[1] https://ja.nsommer.dk/articles/thread-safe-async-sqlite3-ent...


Ouch, that sounds like a nasty bug. But if I understand correctly it's more the driver/ORM's problem than sqlite's?

One to look out/test for early if I go in this direction though. Thanks for the heads up!


> But if I understand correctly it's more the driver/ORM's problem than sqlite's?

I think so. sqlite is pretty reliable in my opinion and I never had these issues anywhere else, but this bug made me switch my DBMS for a small side project from sqlite to postgres.

> One to look out/test for early if I go in this direction though.

You're welcome!


I love sqlite3. In python: https://docs.python.org/3/library/persistence.html

I have of course pickle -> open()

But eventually my projects grow large enough that sqlite3 becomes the database. I have never needed to go beyond sqlite3 in my projects. It does everything I ever need it to do.


I wrote a stock trading strategy backtester for a client in Go and SQLite and it read 100k rows/sec on a cheap consumer grade SATA SSD.


I'm working on a self-hosted analytics tool like Plausible, but one that allows using SQLite as the primary database. Of course it should be used for small websites and side projects, but it will have all the Plausible features. Am I crazy?

This is the project: https://github.com/a-chris/faenz


I love SQLite and use it for a lot of my harebrained ideas to see if they'll take off. They never do, so I never out grow SQLite. :P


Currently using SQLite3 as the backend db of www.rtljobs.com.

I love it - very robust, lots of documentation, StackOverflow answers, example queries, etc.

On a typical day, I get fewer than 50 users globally, so I don't really have to worry much about concurrency or other issues that SQLite struggles with. I'd wager that many web applications are perfectly well served by it.


On first learning of it, about 15 years ago, I thought it was an amazing idea and converted my personal website to use it. The performance at the time was lackluster, and I shortly rolled back to MySQL.

My manager currently runs a number of personal sites with a SQLite backend and they all seem very performant so I have been honestly considering giving it a second look.


I use SQLite in production and it works great.

You should understand whichever RDBMS you use, and how to get the best performance out of it. Previously I used Postgres extensively, and it worked fine, and before that I managed MySQL servers. They are all fine, but SQLite is as simple as it gets, and more than adequate for most workloads.


Expensify are doing this to a certain extent and have written about it before: https://news.ycombinator.com/item?id=23291779 There have been a lot of discussions about this on HN over the years too.


Side question: what’s something as simple as SQLite, but more of an unstructured key-value store?

I’ve been using (locally) a Redis container for a very early prototype because it seems to be simple enough to use.

I know you can query JSON strings in sqlite, but that's not quite the same thing. For one, Redis offers some geo features.


Have you looked at RocksDB? http://rocksdb.org/


I just make key-value store tables and write a small interface to simplify access in H2 or SQLite.


But why not just use Redis (which is almost surely faster even vs. SQLite in-memory) instead of creating key-value tables?


Because then I need redis. Dependencies are icky.


LMDB. Previously, BDB.


I have a few low volume sites backed by SQLite, but perhaps not in the manner you're getting at. Mine are all sites that have a process to insert/update or read the data, then generate static html from it, so I don't need to deal with multiple simultaneous connections to it.


I use it as the data store for my company's Grafana instances. I back up the SQLite store to an S3 bucket with litestream.io, allowing me to treat the servers as 'cattle'. It has worked perfectly without any issues, and saved the cost of a full RDS instance on AWS.


I wrote a travel blog for a trip I'm currently on using Laravel and an SQLite database. The only people writing to it are my partner and I so that's not an issue. The CPU on my $5 VM would probably bottleneck trying to serve traffic before the database would.


I built a tiny desktop app around it, purely for myself. It was fine.

What I learned from that though is that I would never use it for actual business software, no matter the scale. The fact that it doesn't have proper timestamp support is enough by itself to be crippling.


so long as you are not trying to access the DB with more than one process, sqlite scales as far as most DBs on a single instance.

The only issue is that you'll need to take special care when backing up the DB file (but this is probably the same for most DBs even today.)


Note that this is specifically for write/modify operations. You can perform multiple SELECTs in parallel, so depending on the workload this may be acceptable. As the devs say: "Experience suggests that most applications need much less concurrency than their designers imagine."

https://sqlite.org/faq.html#q5


It's my understanding, especially with WAL mode, that it's fine to read and write from multiple processes (I hope so, because I do this); it's just that any write locks the entire database.
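The usual incantation for that setup, for anyone copying it, looks something like this (a sketch; the pragmas are standard SQLite, the file name is made up):

    import sqlite3

    con = sqlite3.connect("shared.db", timeout=5.0)   # wait up to 5 s for locks
    con.execute("PRAGMA journal_mode = WAL")     # readers stop blocking the writer
    con.execute("PRAGMA busy_timeout = 5000")    # writers retry instead of erroring out
    con.execute("PRAGMA synchronous = NORMAL")   # common pairing with WAL mode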


I use sqlite for a flashcard app to help kids with multiplication. It's running on an ec2 micro instance, because I'm trying to see how long I can stay in the AWS free tier.

Everything is good so far, though most of my traffic is bots probing for wordpress flaws.


I do not quite understand the premises:

> Given the complexity

Which complexity? It is the simplest possible widespread, reliable and effective solution. Which makes it a primary choice.

> it seems like there are use cases or needs here that I'm not seeing

On the contrary, the use cases for the traditional Relational DB engines are defined: when you need a concurrency manager better than filesystem access. (Or maybe some unimplemented SQL function; or special features.) Otherwise, SQLite would be the natural primary candidate, given the above.

Edit:

I concur about https://blog.wesleyac.com/posts/consider-sqlite being a close to essential read if one has the poster's doubt.

To its "So, what's the catch?" section, I would add: SQLite does not implement the whole of SQL (things that come to mind on the spot are variables; the possibility of recursion was implemented only recently, etc).


> ... the possibility of recursion was implemented only recently, etc).

Noting that "recently" was August 2014 (version 3.8.6), according to https://sqlite.org/oldnews.html.

The "missing" right/full join types just hit trunk this past week and are still being debugged: https://sqlite.org/src/info/f766dff012af0ea3


> Noting that "recently" was August 2014

Right. As an issue, this has little to do with deployment the way the submitter intended; it has to do with software that was implemented against specific static versions of SQLite and that some people still use because no better alternative has emerged in the meantime. The "delay" is more evident (more "daily present") to those users, and has little to do with new products.

Nonetheless, given that, I would first check which subset of SQL you may need - I am not sure how much overlap there is with other mainstream products.


I worked at an old fintech and SQLite was the main db for all client data. One db per user.


I use it for a small site (high reads/writes ratio) where I keep different things in different databases (one for users, one for posts, etc).

Works perfectly well. Mind you, I would use Postgresql if the site were important, just to be on the safe side.


The old(ish) Cyanide implementation of Blood Bowl used SQLite as the data store for all of the game data. It worked, but the game startup time was absolutely dominated by loading data from the database and took too long even on a fast SSD.


I used to work for a mobile app company that made educational apps. We used SQLite as our primary data store on the client and server side. Our data was very small - think the question-and-answer content for flash cards. It worked great.


I'm using SQLite for my blogging engine (https://goblog.app). Easy backups, easy testing, light resource usage and probably faster than a separate database.


I'll plug https://github.com/mathaou/termdbms for people who need a terminal based SQLite management solution


Yes, via Bedrock (bedrockdb.com) for Streamie (streamieapp.com). No complaints.


I'm using SQLite as the primary database for PlotPanel [1]! It's going great with no unpleasant surprises along the way.

[1] https://plotpanel.com


I’d be curious to hear how you deploy new versions of your application with SQLite. Can it be done without downtime?


rqlite author here, happy to answer any questions. One thing I've noticed is a trend towards folks doing bitcoin mining (and related applications) wanting to use rqlite. I think they like that it is very easy to run, and gives them complete control over their data.

https://docs.google.com/presentation/d/1Q8lQgCaODlecHa2hS-Oe...


I only very briefly looked into rqlite. It's very interesting, but if I understand it correctly it's also not geared toward a write heavy workflow. (all writes are routed to the same node)

I.e. it's leaning more toward the moderate, but reliable writes, and heavy read use cases?

Please let me know ~if I'm missing anything~ what use cases I'm missing.


That's correct. rqlite replicates SQLite for fault-tolerance and high-availability, not for performance. In fact, performance takes a hit, for the reasons you state. But it's no worse (nor better) than something like, say, etcd or Consul.

https://github.com/rqlite/rqlite/blob/master/DOC/FAQ.md#what...


Yes! I use it for https://webtoapp.design But its not really impressive as my DB is just 3 megabytes large haha.


Don't knock it - if it works and only costs 3MB that's a win.

The longer you can scale the product without having to scale the application, the better!


My workplace does. It works surprisingly well and has completely changed what I think about SQLite and databases in general. SQLite is more than enough for almost all production needs.


Yes, absolutely, and for many projects.

..but like all things, it depends on your needs. Some have already pointed out the pages on SQLite's on site regarding # of writers (the main issue), etc.


Yes. And probably the best characteristic of SQLite in production is its (albeit accidental or implied) story around Dev/Ops. A whole category of problems just goes away.


Back in 2010-ish I ran a bootstrap/startup that was a community-based writing platform for indie authors. Our writing app was entirely web-based, offline with CRDTs between browser storage and backend, where everyone's book was its own sqlite DB. The forums ran on sqlite, as did the auth system... It worked really well for us (although we had to build a bit of logic around lazily updating schemas). I think it's well suited to user-partitioned data.


I use it for production backups. There's a sqlite db in each compressed archive to store metadata about the backup.

It's probably elsewhere but I don't realize it.


Yes, for a project used internally in my company, 300k rows per week. So far it’s going great. If things get too slow I’ll migrate to Postgres.


It depends on the application you are developing. If you are doing microservices, separating the database from the application is a must.


Some microservices will need persistent local data that is only used by that microservice, and SQLite can be a good fit.


Are there any blog posts that detail, soup-to-nuts, how to deploy a webapp with SQLite on something like AWS or GCP or Render?


You could provision an EC2 instance and mount some persistent storage. I'm not sure if it would make a lot of sense to take advantage of EKS or GKE.
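
As a rough sketch (device names, paths, and the bucket are placeholders, and the Litestream part is optional): attach a persistent volume, keep the SQLite file on it, and stream backups off the box.

  # format and mount an attached EBS volume (device name varies by instance type)
  sudo mkfs -t ext4 /dev/xvdf
  sudo mkdir -p /data && sudo mount /dev/xvdf /data

  # point the app at /data/app.db, then continuously replicate it to S3
  litestream replicate /data/app.db s3://my-backup-bucket/app.db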


For anything meant to be self-hosted it's perfect (except for things that should have been text; text can be version controlled).


I think that iOS Core Data is just SQLite under the hood, so I imagine a lot of iOS apps use it as the main data store.


I am running www.bgtrain.com on sqlite.


I'm considering this right now!


Not exactly, but I did do an iOS app that used Core Data, which has SQLite as its primary datastore. Tricky, because running ad-hoc SQL queries on the database in a simulator or on a device was not straightforward...
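
(If it helps anyone: on the simulator you can usually reach the store from the host with the sqlite3 CLI, something like the sketch below. The bundle id and store name are placeholders.)

  # print the app's data container on the booted simulator, then open the store
  DIR=$(xcrun simctl get_app_container booted com.example.myapp data)
  sqlite3 "$DIR/Library/Application Support/MyModel.sqlite" ".tables"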


Yes, it's great when you want to get started quickly on a single machine.


I used to use SQLite for my website, but it was no good for concurrent requests.


I've used SQLite at my previous job with no problems whatsoever for 5 years.


Yes, used SQLite and failed miserably at multi-process access to the same db file. No DBMS on top of it, so that was to be expected. Moved to Postgres.


> ...I periodically hear about projects that use/have used sqlite as their sole datastore.

SQLite==exclusive access, no sharing, unless read-only.

Basically, it provides a SQL convenience for local usage.


If you need to safely have multiple processes read and write to the same data, it does that great. The writes are serialized, but that's typically how an in-memory shared resource would be implemented as well.
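
For concreteness, a minimal sketch of that setup with the sqlite3 CLI (the database path, table, and timeout are placeholders): WAL mode keeps readers and the writer from blocking each other, and busy_timeout makes contending writers wait for the lock instead of erroring out immediately.

  # one-time: switch the database to WAL so readers and the writer don't block each other
  sqlite3 app.db "PRAGMA journal_mode=WAL;"

  # per connection: wait up to 5s for the write lock instead of
  # failing straight away with SQLITE_BUSY
  sqlite3 app.db "PRAGMA busy_timeout=5000; INSERT INTO events(msg) VALUES ('hello');"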

Do you mean shared across networks?


I have seen SQLite disconnect random sessions of multiple writers, rather than blocking them. Setting WAL mode made no difference.

This database can only be used safely with a single writer at any given time.

To implement the serialization that SQLite does not, do this in the POSIX shell:

  until mkdir ~/.dbapplock 2> /dev/null
  do sleep 3
  done

  trap "rmdir ~/.dbapplock" EXIT
  sqlite3_app ...
The mkdir() system call is defined as atomic by POSIX, so the shell can safely serialize for you, assuming there are never more than a handful of contending processes.


My understanding is that SQLite supports only system locks. So multiple writers will need to either block at the system level or implement some other form of locking to ensure integrity.

A great deal of the complexity of a DBMS is in the granularity of locks, their escalation/de-escalation, and shared-use performance.

I wonder if one day SQLite will support a synchronization/replication protocol. In a way, Fossil SCM is an attempt at SQLite replication, albeit a specialized one.


> A great deal of the complexity of a DBMS is in the granularity of locks, their escalation/de-escalation, and shared-use performance.

Agreed.

> My understanding is that SQLite supports only system locks

Is this any different from a process blocking on a mutex while another writes to a shared resource? I understand a DBMS does it better, but I don't see why it's viewed as a non-starter for SQLite.


Anyone using dqlite in production? (not IoT)


Grafana uses SQLite as its default database


How do you manage permissions with SQLite?


I imagine, since SQLite does not connect over the network but is just part of the program, the permission model is "if the process is allowed to read the file, it can do things".

That would probably boil down to "code execution on the machine means access to the DB". And that sounds pretty reasonable to me.
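
In practice that means ordinary Unix file permissions; a rough sketch (the user, group, and paths are placeholders):

  # the whole "grant" model is just file permissions on the db directory and file
  sudo chown -R app:app /srv/myapp
  sudo chmod 750 /srv/myapp          # dir must stay writable: SQLite creates -wal/-shm files here
  sudo chmod 640 /srv/myapp/data.db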


Isn't Bloomberg's internal database system built on top of SQLite?


Never really used it, but Django seems to use SQLite by default. If you use Python it's worth checking out.



