Inserting a billion rows in SQLite under a minute (avi.im)
573 points by todsacerdoti on July 18, 2021 | 146 comments



Hey all, author here. I didn't expect this to reach the front page of HN. I came here to submit it and it was here already! I am looking for more ideas to experiment with. Here's one idea someone from another forum gave me: exploring recursive queries and using SQLite's random methods to do the insertions.

Another crazy idea is to learn the SQLite file format and just write the pages to disk directly.


I gave this suggestion on Twitter, but it's probably worth mentioning here too, as people here might not be aware of it.

https://www.flamingspork.com/projects/libeatmydata/

libeatmydata shouldn't be used in production environments, generally speaking, as it biases for speed over safety (lib-eat-my-data) by disabling fsync and associated calls for the process running under it. Disabling those calls results in less I/O pressure, but comes with the risk that the program thinks it has written safely and durably when that may not be true. It essentially stops programs that are written to be crash-proof from actually being crash-proof. Which, under the circumstances you're operating in here, you almost certainly don't care about.

I've used this when reloading database replicas from a dump from master before, as it drastically speeds up operations there.


Aside: SQLite has native functionality to do what libeatmydata does (disable fsync):

    pragma synchronous = off
https://www.sqlite.org/pragma.html#pragma_synchronous


Overlayfs offers libeatmydata functionality in recent Linux kernels without having to intercept libc, so it should work for things that use raw syscalls too.


For people wondering when this is needed:

Go is one of the programming languages that makes syscalls directly, so libeatmydata (which works by LD_PRELOADing over libc) has no effect on it.


Also great for stuff like package/container builds where you're not worried about a consistent intermediate state— it either succeeds and you get the end result you want or it fails and you only care about logs.


You mention wanting to do a Go version. Not sure if it's useful, but this is a SQLite "bulk data generation" util I threw together ages ago in Go:

https://github.com/sqlitebrowser/sqlitedatagen

There's some initial work to parallelise it with goroutines here:

https://github.com/sqlitebrowser/sqlitedatagen/blob/multi_ro...

Didn't go very far down that track though, as the mostly single threaded nature of writes in SQLite seemed to prevent that from really achieving much. Well, I _think_ that's why it didn't really help. ;)


Very nice! I shall give that a try. I always get excited by SQLite and Go projects.

I maintain a similar Go tool for work, which I use to stuff around 1TB into MariaDB at a time:

https://github.com/rcbensley/brimming


Potentially interesting:

https://github.com/siara-cc/sqlite_micro_logger_arduino

https://github.com/siara-cc/sqlite_micro_logger_arduino/blob...

This is a heavily subsetted implementation of SQLite3 that can read/write databases (presumably on SD cards) from very small microcontrollers.

It presumably doesn't have the same ACID compliance properties, but with a single <1.5k source file, it may represent a particularly efficient way to rapidly learn the internals if you happened to want to directly manipulate files on disk.

Now I'm thinking it could actually be interesting to see what drh thinks of this implementation (and any gotchas in it) because of its small size and accessibility.


I did similar Rust/Python record generation experiments, and I can relate to these numbers.

However, I think your Rust threaded trial might be a little bit off in subtle ways. I would expect it to perform several times better than single-threaded Rust and async Rust (async is generally slower on these workloads, but still faster than Python).

Edit: After reading your Rust code, you might have room for some improvements:

- don’t rely on random, simply cycle through your values

- pre-allocate your vecs with `with_capacity`

- don't use channels, prefer deques..

- ..or even better, don’t use synchronization primitives and open one connection per thread (not sure if it will work with sqlite?)


I am also interested in writing the SQLite or PostgreSQL file format straight to disk as a faster way to do ETL. I'd be curious to hear if anyone has actually tried this, and just as curious if you end up trying it too.


SQLite has a btree module/API, though there are a lot of "TODO:" notes in the document: https://sqlite.org/btreemodule.html


TIL, this is great! I think this could make things even easier than writing the raw pages.


I really like sqlite but I'm curious how fast it would be to write the data to parquet files and read it with duckdb. I've done something similar before and in my case duckdb was faster than sqlite.


Yes, I plan to try this next and measure the performance. Others have also suggested similar ideas.

> I am also interested in writing the SQLite or PostgreSQL file format straight to disk as a faster way to do ETL.

What exactly are you trying to do here?



I'm curious how you'd fare if you pregenerated the data, with periodic COMMITs, and then just slung it at the sqlite3 binary over STDIN to see how it handled that.

Certainly if I was trying to do the same thing in Pg my first thought would be "batched COPY commands".
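
A rough sketch of that stdin idea in Python, assuming the sqlite3 CLI is on PATH and borrowing the article's user schema; the file name, batch size, and fixed values are placeholders:

    import subprocess

    # Stream pregenerated SQL (with periodic COMMITs) straight into the sqlite3 CLI.
    proc = subprocess.Popen(["sqlite3", "stdin_test.db"], stdin=subprocess.PIPE, text=True)
    w = proc.stdin.write
    w("PRAGMA journal_mode = OFF;\n")
    w("PRAGMA synchronous = 0;\n")
    w("CREATE TABLE user(id INTEGER PRIMARY KEY, area CHAR(6), age INTEGER, active INTEGER);\n")

    batch = 50_000
    for start in range(0, 1_000_000, batch):   # small total, just for the sketch
        w("BEGIN;\n")
        for _ in range(batch):
            # A real generator would emit random values here.
            w("INSERT INTO user(area, age, active) VALUES ('123456', 5, 1);\n")
        w("COMMIT;\n")

    proc.stdin.close()
    proc.wait()

Whether this beats a direct library binding mostly comes down to how fast the SQL text can be generated and parsed.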


Why don't you create an in-memory database (see SQLITE_OPEN_MEMORY, or ":memory:", or sqlite3_vfs_register) and copy the result to a real file?
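
For what it's worth, a minimal sketch of that in Python's stdlib, using the backup API to dump the in-memory database to a file at the end (file name and dummy values are illustrative):

    import sqlite3

    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE user(id INTEGER PRIMARY KEY, area CHAR(6), age INTEGER, active INTEGER)")
    mem.executemany("INSERT INTO user(area, age, active) VALUES (?, ?, ?)",
                    (("123456", 5, 1) for _ in range(1_000_000)))
    mem.commit()

    # Copy the finished in-memory database to a real file in one pass.
    disk = sqlite3.connect("copied_from_memory.db")
    mem.backup(disk)
    disk.close()
    mem.close()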


Possibly because:

> The machine I am using is MacBook Pro, 2019 (2.4 GHz Quad Core i5, 8GB, 256GB SSD, Big Sur 11.1)

Given the target schema, 100M rows with only 8GB RAM risks hitting swap hard.


It's a great article! Always nice to see clear well-written guides that save people time from reinventing the wheel.


You can go even faster if you can organize your problem such that you simply start from a copy of a template SQLite database each time (it is just a file/byte sequence).

We do SQL evaluation for a lot of business logic throughout our product, and we have found that starting from a template database (i.e. one with the schema predefined and canonical values populated) can save a lot of time when working in tight loops.


Are you suggesting having a template and then updating rows with random values where necessary?


You just do raw byte copies from the sample DB, no SQL or "inserts" or anything similar. Imagine the test database consists of 3 parts (all raw bytes):

### PROLOGUE ### Sample row ### EPILOGUE

You copy & write the prologue, then write 1B sample rows (you can optimize this at will: large writes, etc.)

Copy & write the epilogue and fsync the data. You probably need to modify some metadata, but that should be a few writes at most.

That should be as good as it gets, provided your IO is optimal.


Do you mean crafting all the various database page btree structures and entries yourself? I'd be concerned about subtle bugs.


An SQLite database is just a file. You can build the empty database with schema and base values ahead of time, save it to a file (or an in-memory byte buffer) and then every time you want to create a new database, you just copy that file. No need to do any expensive initialization queries that way. If raw high-speed throughput is needed, skipping that step can make a significant difference.
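
A minimal sketch of that in Python, with hypothetical file names:

    import shutil
    import sqlite3

    # One-time cost: build the template with the schema (and any canonical rows).
    tmpl = sqlite3.connect("template.db")
    tmpl.execute("CREATE TABLE user(id INTEGER PRIMARY KEY, area CHAR(6), age INTEGER, active INTEGER)")
    tmpl.execute("INSERT INTO user(area, age, active) VALUES ('123456', 5, 1)")
    tmpl.commit()
    tmpl.close()

    # Per-use cost: a plain file copy, no DDL or other initialization queries.
    shutil.copyfile("template.db", "fresh_copy.db")
    con = sqlite3.connect("fresh_copy.db")
    print(con.execute("SELECT count(*) FROM user").fetchone())  # (1,)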


Yes, that approach makes sense. I thought what I was replying to was suggesting writing b-tree pages themselves, outside of sqlite, for the new data.


I guess if you wanted the fastest creation you could make a custom backend format for sqlite and use that. Especially if query speed was not important.


If SQLite has any internal counters or indexes, let alone B-trees or anything fancy, this approach won't work. It will work for a raw record format (CSV, JSON, Protobuf, an mmapped array, etc.), but the author wants to actually interact with a real SQLite database. Generating a billion rows in some non-SQLite format still leaves the problem of loading that format into SQLite, which isn't really a reduction of the original problem.


This idea is mentioned as future work at the bottom.


Yes. Start from known-good database and then update or insert with needed deltas. You could even have more than you need in the template file and truncate/drop what is not relevant for the current context. Depending on the situation, it might be faster to start from a database containing all of the possible things.


This makes sense. Just do a copy or copy-on-write to an existing database file; you could even have the actual bytes of an empty (or pre-DDL'ed) sqlite file in memory in your app, rather than needing to do a disk copy.


Database optimization posts are always interesting, but it's really hard to do any apples to apples comparison. Your performance is going to depend mostly on your hardware, internal database settings and tunings, and OS level tunings. I'm glad this one included some insight into the SQLite settings disabled, but there's always going to be too many factors to easily compare this to your own setup.

For most SQL systems, the fastest way to do inserts is always just going to be batched inserts. There are maybe some extra tricks to reduce network costs/optimize batches [0], but at its core you are still essentially inserting into the table through the normal insert path. You can then basically only try to reduce the amount of work done on the DB side per insert, or optimize your OS for your workload.
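
To make the batched-insert point concrete, here's a sketch with Python's sqlite3 (table, batch size, and values are illustrative): one explicit transaction wrapped around an executemany of a pre-built batch.

    import sqlite3

    # isolation_level=None means we manage transactions ourselves.
    con = sqlite3.connect("batched.db", isolation_level=None)
    con.execute("CREATE TABLE IF NOT EXISTS user(id INTEGER PRIMARY KEY, area CHAR(6), age INTEGER, active INTEGER)")

    rows = [("123456", 5, 1)] * 100_000          # one pre-built batch
    con.execute("BEGIN")
    con.executemany("INSERT INTO user(area, age, active) VALUES (?, ?, ?)", rows)
    con.execute("COMMIT")
    con.close()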

Some other DB systems (more common in NoSQL) let you actually do real bulk loads [1] where you are writing direct(ish) database files and actually bypassing much of the normal write path.

[0] https://dev.mysql.com/doc/refman/5.7/en/insert-optimization.... [1] https://blog.cloudera.com/how-to-use-hbase-bulk-loading-and-...


Oracle, DB2, MySQL, SQL Server, and PostgreSQL all support bulk insert. Two obvious use cases are QA databases and the L part of ETL, which pretty much require it.


For relational databases in the enterprise with large amounts of data, batched inserts work great.

For large deletes it is often better to move the rows that won't be deleted to a new table and rename the table when done.

With large updates it is important to look at the query plan and optimize it with good indexes. Batching also works well in this scenario.


I thought one billion was 1000M (and not 100M)?


This is an important comment. It's not a typo in the article, even the source code only does 100M rows, not 1B. The author definitely does not hit the target of 1B rows in a minute.


The author doesn’t say a billion is 100 million. They say they’d like to be able to insert a billion, and say they’re able to insert 100 million. It’s not a contradiction.


Ok but that is a rather clickbaity title. The title makes it sound like they are successfully doing that.


Hey, sorry for the misleading title. I started with 'Fast SQLite Inserts' and it went through many iterations. In the title, I wanted to convey that I want to insert 1 billion rows in under a minute on my machine. I thought the current title was fine, since I got LGTMs for earlier drafts. The detail about 'on my machine' is also important, since mine is a two-year-old laptop and all the measurements were done on it.

I also got feedback that the title should indicate that it is a test database and emphasise that it is not durable.

I am wondering about the right way to convey all of this in the title while also keeping it short.


>I am wondering about the right way to convey all of this in the title while also keeping it short.

Add "trying":

"Trying to insert 1 billion rows in SQL in under a minute".

If anything it's more interesting because it implies the chance of failure.


Thanks for the explanation. I think it only needs a minor tweak. Maybe prefix with “trying to” or something like that. I am empathetic to the challenge of naming things concisely and accurately.


Hey, I read all the suggestions and prefixed the title with "Towards ...". I hope it makes it clear now


Attempting 1 Billion Row Inserts in under 1 Minute


Achieving 100m SQLite inserts in under a minute.


I hate hyperbole, but I think the title is fine.


Perhaps "Towards Inserting 1B Rows in SQLite in Under a Minute" would be a better title.


Thanks for this suggestion and I have updated the title!


I agree. While the title reflects the eventual goal of the effort, the goal has yet to be achieved (and may or may not be achievable at all). I think it's a bit irresponsible to use a title like that for a post that hasn't achieved what the title describes.


Perhaps "Working towards 1B rows in under a minute" would have been better.


Earning $10M/month with a saas


[flagged]


There’s no need to be snarky. I didn’t write the title, I’m just explaining what the author means.


I'm just memeing what it sounded like in my head, nothing personal


This depends a lot.

For English-speaking countries, 1B = 1M * 1k

For Spanish-speaking countries, 1B = 1M * 1M

For other languages it's a big "it varies", though the second definition seems to be the most common. The term "billion" is honestly, as ambiguous as using "06-03" for a date.

Also note that, historically, English also followed the second definition, so for old literature it's also confusing.


Exactly.

  thousand = 1000
  million  = 1000 * thousand (or 1000^2)
  billion  = 1000 * million (or 1000^3)
  trillion = 1000 * billion (or 1000^4)

(not to discount regional differences)

https://www.brainyquote.com/quotes/everett_dirksen_201172


1,000,000 million. In old money.



Yeah, that's my interpretation of a billion too. I vaguely recall that India or Britain interprets a billion differently though. Maybe that's what they're thinking?


I think it's the same in UK English as well. But in some (most?) European languages billion actually means 1 000 000 million (so a thousand times more). And we use "milliard" for 1000 million.


https://en.wikipedia.org/wiki/Long_and_short_scale for anyone that's interested in the differences


It looks like you're right, I just looked it up. A billion is 1 million^2 in its etymology, so English speakers are the odd ones out.


He hasn't reached the goal yet, and is currently doing 100 million in ~34 seconds.

> Looking forward to discussions and/or collaborations with curious souls in my quest to generate a billion record SQLite DB quickly. If this sounds interesting to you, reach out to me on Twitter or submit a PR.


In Swedish, and I think in German and other languages too, 1 billion is 1e12. 1e9 is called milliard / miljard in Swedish.


Yes, in Germany too.


And in the Netherlands.


Also in Bulgaria


Just about everyone in the UK defines a billion to be 10^9. I've heard about the 10^12 definition but never encountered anyone who uses it - I think it must have been an older usage that fell out of favour.


In 1974, the UK government abandoned the long scale (million, milliard, billion, billiard) in favor of the short scale (million, billion, trillion, quadrillion) used in the US. The long scale is still used in languages like Dutch, German, and French.


And in 2001, The 51st State was released...

https://en.wikipedia.org/wiki/The_51st_State


    INSERT INTO user (area, age, active)
    SELECT abs(random()) % 1000000,
           (abs(random()) % 3 + 1) * 5,
           abs(random()) % 2
    FROM generate_series(1, 100000000, 1)

On my machine this is about 10% faster than the author's fastest implementation: 19 seconds against 21 for 'threaded_batched'.


I have to assume this was rejected as a valid approach, but if not, this whole thread gets 10x more interesting than it already was. I hope the author responds.


Well, even if you just insert zeros instead of random values, it takes 9 seconds on my computer to insert 100M rows, so even that is not 1B rows per minute.

And I think INSERT INTO ... SELECT is the fastest way to bulk insert data into sqlite.

Also, I have tried the carray SQLite feature, which allows sharing memory with SQLite, together with a recursive CTE to query it, but it is slower. Though you can pass the values you've generated from Rust instead of using random().


This is with an in-memory database and journaling disabled?


My run on NixOS 21.05 with sqlite 3.35.2, without randomness, on an i7-7500U:

Invocation:

    command time sqlite3 ':memory:' '
    create table IF NOT EXISTS user
    (
        id INTEGER not null primary key,
        area CHAR(6),
        age INTEGER not null,
        active INTEGER not null
    );
    INSERT INTO user (area, age, active) SELECT 0, 1, 2 FROM generate_series(1, 100000000, 1);
    '
Result:

    16.34user 0.43system 0:16.89elapsed 99%CPU (0avgtext+0avgdata 1477320maxresident)k
    11inputs+0outputs (0major+369851minor)pagefaults 0swaps
Invocation with pragmas:

    command time sqlite3 ':memory:' '
    PRAGMA journal_mode = OFF;
    PRAGMA synchronous = 0;
    PRAGMA cache_size = 1000000;
    PRAGMA locking_mode = EXCLUSIVE;
    PRAGMA temp_store = MEMORY;
    create table IF NOT EXISTS user
    (
        id INTEGER not null primary key,
        area CHAR(6),
        age INTEGER not null,
        active INTEGER not null
    );
    INSERT INTO user (area, age, active) SELECT 0, 1, 2 FROM generate_series(1, 100000000, 1);
    '
Result with pragmas:

    17.31user 0.41system 0:17.85elapsed 99%CPU (0avgtext+0avgdata 1477288maxresident)k
    11inputs+0outputs (0major+369850minor)pagefaults 0swaps
As expected, the pragmas make no difference when using `:memory:` -- 17 seconds, 1.4 GB RAM each on my laptop.


I have used the pragmas from the article.

Since it is only 100M rows, it takes 1.8 GB on disk, so I've used tmpfs for this, which is essentially a ramdisk. But I have a gen4 PCIe NVMe SSD - it can reliably write at 4GB/s sequentially, so writing takes ~500ms for 100M rows; it is not a bottleneck here. random() takes about half of the insert time. Generating those values with Rust, for example, is faster, but sharing that data with SQLite takes more time than generating it with random().

Maybe implementing a custom virtual table in C or Rust, like the built-in generate_series but one that produces the user table fields, would be faster, but that is significantly more effort than my query.

This query with random() and generate_series executed in the sqlite CLI takes a whopping 8MB of RAM, so you don't even have to close all your Electron-based applications to run it on a computer with 8GB of RAM.


Creator of RediSQL / zeeSQL (https://zeesql.com/)

Insertion performance on a single table is very, very hard to optimize.

A single process looping over the inserts is your best bet.

I would just increase the batch size, which is the most influential factor.

Then another point... When you do batches, you do

    BEGIN TRANSACTION;
    for i in range(1, 50):
       execute_stmt
    COMMIT;
You do not create a long list of parameters.

https://github.com/avinassh/fast-sqlite3-inserts/blob/master...

;)


That's a great point, let me try it!

> You do not create a long list of parameters.

I have done much worse by trying to insert a really long string of 100K params


Even though merging strings seems faster...


How long does it take when using the native CSV import features?

https://www.sqlite.org/csv.html
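
The link above is the CSV virtual-table extension; the shell also has a .import dot-command for bulk loading. A sketch driving it from Python, assuming the sqlite3 CLI is installed and a data.csv with a header row exists:

    import subprocess

    # .mode csv + .import loads the file; with no pre-existing table,
    # the CSV header row supplies the column names.
    commands = ".mode csv\n.import data.csv user\n"
    subprocess.run(["sqlite3", "csv_test.db"], input=commands, text=True, check=True)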


It’s fast. But then we need a “1 billion CSV lines” article.


Which I expect is significantly faster to generate. You could probably combine it with a pipe to avoid all disk IO.


This is what I did the last time I ran into SQLite being slow with inserts.

I just shat out a CSV file and then imported it. It was much quicker! (like 1000x faster)

This article has some useful notes for me now to try the next time I play with my pet projects. :)


I built https://superintendent.app and experimented a lot with this feature.

I can import a 1GB CSV file in 10 seconds on my MacBook. This queries from a virtual table and puts the data into an actual table.


Interesting!

I was actually just working on SQLite speed for Objective-S (http://objective.st), partly as a driver for getting some of the more glaring inefficiencies out.

Using a "to do list" schema, I currently get the 100M rows out in 56 seconds, which is around half the speed of the Rust example given here, but 3 times the speed of PyPy and almost 10x faster than Python.

This is from an interpreted script that not only does the inserts and creates the objects to insert in the first place, but also defines the actual class.

The lower-level code is written in Objective-C, like the rest of Objective-S.

Class definition:

    class Task {
      var <int> id.
      var <bool> done.
      var <NSString> title.
      -description { "<Task: title: {this:title} done: {this:done}>". }
      +sqlForCreate {
          '( [id] INTEGER PRIMARY KEY, [title] NVARCHAR(220) NOT NULL, [done] INTEGER );'.
      }
    }.

Code to insert a computed array of tasks 10 times:

     1 to:10 do: {
         this:tasksTable insert:taskList.
     }.


If it's a single in-memory table, is there really a need to use a database? Won't language-provided data structures suffice?


It is a good question, but I'm guessing that the whole reason for creating the database in the first place is to test interactions with the database, in which case you'd have to mock up all the interactions being tested (selects, deletes, joins, etc.).


There are some domains like building cluster managers where having a relational view of the cluster state and being able to query/manipulate it declaratively has significant perks. See for example: https://www.usenix.org/system/files/osdi20-suresh.pdf (disclaimer: I'm an author)


Yes, if you want to run non-trivial queries on that data.

Although, frankly, SQLite would not be my choice.


I can't think of a single case where an in-memory database is a good option, aside from playing with SQL.

Whatever the queries might be, a hashmap/tree/skiplist, etc. would be a lot better.


SQLite provides a full-text search module (FTS5) with a lot of nice search features. I have used this numerous times to build serverless APIs for large static datasets, for frontend analytical experiences like data tables, without setting up Elasticsearch. That's one use case.

Another is the closures.c extension, which is very nice for rapid queries on tree-structured data locally in memory. The JSON1 extension is also nice for rapidly querying/reshaping deeply nested JSON data. There's also spellfix1, which can provide fuzzy-matching capabilities. If you need any of these with low latency and in memory, it's a great choice.

SQLite is great for rapidly building low-latency static data serving services for frontend experiences. Something I'm exploring now is combining SQLite with Streamlit to rapidly build data exploration UIs.

How many times have you wanted to quickly add fuzzy matching or full-text search to some program? You could use fuzzywuzzy, but it's pretty slow; SQLite provides performant implementations of this stuff that are super simple to set up.
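
A tiny FTS5 sketch in Python, assuming the bundled SQLite was compiled with FTS5 (true for most recent builds); the table and rows are made up:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    con.executemany("INSERT INTO docs VALUES (?, ?)", [
        ("sqlite inserts", "batched prepared statements keep bulk loads fast"),
        ("search notes",   "fts5 supports ranking, phrase queries and prefix matching"),
    ])

    # Prefix query against the full-text index.
    for (title,) in con.execute("SELECT title FROM docs WHERE docs MATCH 'prefix*'"):
        print(title)   # -> search notes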


What do you think a relational database does to provide you the querying flexibility? It builds the same exact data structures, but does it automatically and dynamically, while offering much more usability.

Most attempts to query using raw data structures just mean you end up rebuilding a (very poor) relational database with none of the features.


If the set of queries you want to run is fixed, you can probably beat SQLite with a lot of work; if the queries are dynamic, I doubt you can do much better than SQLite without herculean effort. An in-memory database is thus a good option if you either don't care too much about the runtime of your queries and want to save a bunch of development time, or if you don't know enough about your workload to beat a general-purpose solution.


>you can =probably= beat SQLite with =a lot of= work,

Actually, I am beyond certain. When it's all in memory, no database comes even remotely close to properly picked data structures + structure/object layout.

If I need a transaction log + persistence, databases have a decent application.

In more than 20 years, I have never had a case of: yay, I can use relational structures in memory because I don't know what I am going to do with the data.


For the case where I want the ability to run one-off analytics queries, having access to the usual set of join/filter/summarize/etc. operations is a lot more convenient and less error-prone than having to manually write them. But dplyr [1] is my go-to rather than SQLite personally for in-memory data sizes.

[1] https://dplyr.tidyverse.org/


For something like redis, but with more structure.

But a pretty good use case (IMO) is testing. If you want to do an integration test with an SQL database, and you want to test large numbers, this might be a good fully functional stub to run locally.


>For something like redis,

Redis is in pretty much the same category. Testing is sort of a valid case, if you are committed to writing pure SQL with minimal use of any dialect specifics (but even 'create table' syntax might be different). When it comes to databases, there's really no substitute for running against the real thing.

Many databases have Docker images nowadays, so it isn't hard to run them locally. Likely at some point you'd want to optimize the SQL itself, rendering the tests incompatible.


It's very likely it wouldn't. (Decent) analytical DBMSes have a whole lot up their sleeves which just choosing one plain-vanilla data structure for your table doesn't even approach in terms of performance. The benefit may be well upwards of 10x in many real-world scenarios.

Of course, if you know you get one single query which you know in advance, carefully build a data structure to cater just to that, and you know your data structures beyond the just the basics - then, yes, a DBMS would be overkill. But it still won't be a walk in the park.


What would be your choice?


If you don't do any joins, ClickHouse is worth a try. Otherwise, MonetDB (or even MonetDB embedded) if you want FOSS. Commercial offerings - Actian Vector, or maybe HyperDB. The big corporate ones are probably pretty good, but it's always difficult to tell because there's this taboo on any proper benchmarking against them.

If you just want to browse what other people are using, and not go by the recommendations of a random commenter, try:

https://db-engines.com/en/ranking

but note that's a joint ranking both for transaction-focused and analytics-focused DBMSes.

If it's time series data there are some more specialized offerings and I'm (even) less of an expert there.


Ah, I forgot a newer option: DuckDB. It self-styles as "SQLite for analytics"... https://duckdb.org/


The table has 3 integer columns and a char(6) column. Taking an integer as 32 bits, that is a total of 18 bytes of data per row.

At a billion rows, that is 18GB of data. With some overhead for storing page info, let's call it 20GB flat.

A modern SSD can deliver ~500MB/s write speed. That means writing 20GB of data can be done in 40 seconds.

Therefore a billion rows in a minute is quite plausible. At least it is not bottlenecked by disk speed (if we can saturate the disk).


It's a good insight to know what the hardware is capable of.

The SSD in the author's machine can do 1300MB/s and the latest M1 model can do 2100MB/s.

For current gen SSDs with PCIe 4.0, that number increases to 6600MB/s for the Sabrent Rocket 4 Plus.

That would mean just around ~5 seconds for writing to disk.

https://www.pcgamer.com/best-pcie-4-ssd-for-gaming/


> A modern SSD can deliver ~500MB/s write speed.

More like ~5GB/s.


If you insert 4 rows into a table, and join it on itself without any conditional criteria, it will result in a cartesian product of 16 rows (Select * from a, a)

Do that cartesian join three more times, and you have over 4 billion results.

Then you simply need to use the random function in conjunction with division, rounding and case statements to get the desired random numbers.
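
A small-scale sketch of the blow-up in Python; the commented-out INSERT at the end is roughly the shape the real load would take:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE a(x)")
    con.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,), (4,)])

    # Each unconditioned self-join squares the row count: 4 -> 16 -> 256 -> 65536 -> ~4.3B.
    print(con.execute("SELECT count(*) FROM a AS p, a AS q").fetchone())  # (16,)

    # For real data, cross-join enough copies and select random() expressions, e.g.:
    # INSERT INTO user(area, age, active)
    # SELECT abs(random()) % 1000000, (abs(random()) % 3 + 1) * 5, abs(random()) % 2
    # FROM a AS t1, a AS t2, a AS t3;  -- 4^k rows for k copies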


Yep - I discovered most of this a while back when I needed to do a bunch of exploring of ~200m rows of data (like 50GB). I chucked it into SQLite and was done with everything in a day, it was absolutely wonderful. This post is a pretty good summary of what to change and why.

Prepared statements should be considered the norm for SQLite - they have pretty major performance benefits, and any decent ORM or query engine can probably do it for you implicitly or with a single flag, so it's practically free for many applications. I'm always surprised how often I see applications or benchmarks or "how to be fast with SQLite" not using them, so I'm definitely glad to see them covered here :)
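
As an illustration of the gap, here's the difference in Python's sqlite3 between rebuilding the SQL text per row and binding parameters to one prepared statement (values are synthetic):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE user(id INTEGER PRIMARY KEY, area CHAR(6), age INTEGER, active INTEGER)")
    rows = [(f"{100000 + i:06d}", 5 * (i % 3 + 1), i % 2) for i in range(100_000)]

    # Slow path: a different SQL string per row, so each one is parsed and planned again.
    for area, age, active in rows:
        con.execute(f"INSERT INTO user(area, age, active) VALUES ('{area}', {age}, {active})")

    # Fast path: one statement prepared once, with parameters bound per row.
    con.executemany("INSERT INTO user(area, age, active) VALUES (?, ?, ?)", rows)
    con.commit()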


I've been dealing with lots of data and SQLite and my outtakes are:

- the language has very little to do with it, since the bottleneck will most likely be the way you insert data

- prepared statements are indeed useful, but the performance didn't change much once I used long transactions and committed every few thousand rows

- having lots of rows in your table is fine, but certain queries, like aggregation over many rows, are not what SQLite is great at

- ClickHouse can easily ingest that and more on a laptop without even needing a scripting language


I ran this and got the following results:

      /fast-sqlite3-inserts (master)> time make busy-rust
      
      Sun Jul 18 17:04:59 UTC 2021 [RUST] busy.rs (100_000_000) iterations
      
      real 0m9.816s
      user 0m9.380s
      sys 0m0.433s
      
      ________________________________________________________
      Executed in    9.92 secs    fish           external
      usr time    9.43 secs    0.20 millis    9.43 secs
      sys time    0.47 secs    1.07 millis    0.47 secs
      
      
      fast-sqlite3-inserts (master)> time make busy-rust-thread
      
      Sun Jul 18 17:04:48 UTC 2021 [RUST] threaded_busy.rs (100_000_000) iterations
      
      real 0m2.104s
      user 0m13.640s
      sys 0m0.724s
      
      ________________________________________________________
      Executed in    2.33 secs    fish           external
      usr time   13.68 secs    0.20 millis   13.68 secs
      sys time    0.78 secs    1.18 millis    0.78 secs
I'm probably doing something wrong. Or I'm getting the pace needed for the billion?

This is on a M1 MacBook Air.


Really good article! I've experimented with this kind of workload in SQLite a few times and found it insightful. One note though, using:

  pragma temp_store = memory;
only affects temporary tables and indices, not the main database itself[0]

[0] https://sqlite.org/pragma.html#pragma_temp_store


Here is an example using Tcl to create an SQL script using recursive common table expressions:

    # createSQL.tcl
    set increment 100000
    set loop [expr {100000000 / $increment}]

    puts {PRAGMA journal_mode = OFF; PRAGMA synchronous = 0; PRAGMA cache_size = 1000000; PRAGMA locking_mode = EXCLUSIVE; PRAGMA temp_store = MEMORY;

    CREATE TABLE user( pk INTEGER PRIMARY KEY, area INTEGER, age INTEGER, active INTEGER );}

    for {set i 0} {$i < $loop} {incr i} {
        puts " WITH RECURSIVE tmp(pk,area,age,active) AS (
            SELECT [expr {$i * $increment}], 500001, 5, 0
            UNION ALL
            SELECT pk+1, 10000*(abs(random())%9+1)+abs(random())%100000, 5*(abs(random()%3)+1), abs(random()%2)
            FROM tmp WHERE pk < [expr {($i+1)*$increment-1}]
        ) INSERT INTO user SELECT * FROM tmp;"
    }
    #EOF

Core 2 Duo 2.53 GHz, 4GB RAM, SSD:

    $ time -p tclsh createsql.tcl | sqlite3 test.db
    off
    exclusive
    real 213.37
    user 208.10
    sys 4.58


I'm pretty new to this sort of optimization stuff, so forgive me if this is ridiculous for any reason, but I'm wondering about two things reading this:

1. Is it useful/feasible to set a sort of "lower bound" for these optimizations by profiling the raw write time on a set of a certain size?

2. Assuming that writes are ultimately the limiting factor, could you gain performance by calculating the minimum batch size at which its write time intersects the in-memory generation time, and as soon as you hit that size, pop a thread to write that batch to disk - then return some async signal when the write is done to trigger the next batch of writes? Is it possible to "stitch together" these batches post-write?

edit: mobile formatting is le suck


Mini summary: Do not use this in production. xD


I wonder if defining a virtual table (https://www.sqlite.org/vtab.html) and just doing

INSERT INTO t (...) SELECT ... from virtual_table

would be any faster.


From my armchair here I thought something similar. I think I heard 1) that inserting into the same table has some speed limitations, and 2) that one of the author's ideas was to spin up 4 processes…

I wonder if you could insert into 10 different tables from 10 threads or processes then

insert into single_table (…) select (…) union.

No idea if INSERT ... SELECT does the trick or not, but you're almost at the point of partitioning here. If the application called for it, you could do some sort of poor man's partitioning to write all the data and then push some complexity onto the read side.

I also wondered what impact, if any, the id primary key had on inserting at this rate.

/armchair


If you define the virtual table would you even need to do the `INSERT`? Just query the virtual table instead of inserting into `t` and querying `t`.


Just tried to insert 100M KV pairs of the same data format in a KV store. It takes about 26s on a Xeon Silver 4210, including fully writing down the data to the filesystem/SSD.

The source code can be found here: https://github.com/wuxb45/remixdb/blob/master/i100m.c

    $ make i100m.out libremixdb.so
    $ numactl -N 0 ./i100m.out
    insert time without sync: 24.865s, with sync: 25.801s


>The generated data would be random with following constraints: The area column would hold six digits area code (any six digits would do, no validation). The age would be any of 5, 10, or 15. The active column is either 0 or 1.

That is about 6 million combinations. This is not such a big space that it would be impractical to precompute all possible combinations. I wonder if hashing into such a precomputed table might help. It might not, if the overhead of maintaining the lookup table is too high.
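
A sketch of precomputing that space in Python (10^6 area codes x 3 ages x 2 flags = 6M rows; a recursive CTE stands in for generate_series, which the stdlib sqlite3 doesn't ship):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE ages(age);    INSERT INTO ages VALUES (5),(10),(15);
    CREATE TABLE actives(a);   INSERT INTO actives VALUES (0),(1);
    CREATE TABLE areas(area);
    WITH RECURSIVE seq(n) AS (SELECT 0 UNION ALL SELECT n+1 FROM seq WHERE n < 999999)
    INSERT INTO areas SELECT printf('%06d', n) FROM seq;
    -- Every possible row, computed once: 1,000,000 * 3 * 2 = 6,000,000 combinations.
    CREATE TABLE combos AS SELECT area, age, a AS active FROM areas, ages, actives;
    """)
    print(con.execute("SELECT count(*) FROM combos").fetchone())  # (6000000,)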


Webapps (or webapp frameworks) like Nextcloud and Django come with SQLite by default. That makes it easy to start, but they all warn you: when you get serious, get a real DB like MariaDB or PostgreSQL... And with stuff like this I wonder, is that really necessary?

I really like SQLite: it's well supported in Python, and backup/restore is very simple. What is the real difference (probably application dependent) between SQLite and a "real" DB engine?


This spells out where it’s appropriate and where it isn’t:

https://www.sqlite.org/whentouse.html


Thanx!


I guess the idea is you can scale the app independently of the DB. How do you scale out a single file? (Genuinely curious)


Hmm, never thought of this :) I come more from a self-hosting angle, and my Django website is prototype-only so far… But it is a point I never considered, so thanx.

I wonder at what scale this starts to matter (in human readable units ;))


I wonder if something like https://paulbradley.org/sqlite-test-data/ could be used to generate the test data in sqlite3 directly, and if it would be any faster.

(It would likely be faster for other DB engines, because there is network overhead in the communication between the program and the DB; no such thing for SQLite.)


Here's my solution: generate the 1B-record .sqlite file once beforehand, then, before running each test, copy it over the test DB file ;)


I would recommend looking into dtrace or other observability tools to figure out what to optimize next. So far you have basically had to guess where the slowness is and optimize that.

Lastly, how long does it take to make a copy of an already prepared 1B-row SQLite file? That seems easier than generating a new one.


1 billion rows, 8GB RAM. I think there will be some limitation from available RAM as well.


There are some cool ideas in there!

I wonder how much I could speed up my test suites (those that don't rely on transaction rollbacks) by disabling journalling. Those milliseconds per query add up to seconds per test and minutes per PR.
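
If it helps, that's just a couple of pragmas at connection setup; a sketch in Python, on the understanding that a crash mid-run can corrupt the file (acceptable for throwaway test databases):

    import sqlite3

    def fast_test_connection(path=":memory:"):
        con = sqlite3.connect(path)
        # Trade durability for speed; fine for disposable test databases.
        con.execute("PRAGMA journal_mode = OFF")
        con.execute("PRAGMA synchronous = OFF")
        return con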


> Threads / async may not be faster always

This is actually a little bit mind blowing for me. I'm gonna go and play with this. But what a cool read!


>Inserting One Billion Rows in SQLite Under A Minute

But the results given are all for inserting 0.1B rows, not 1B rows. What are the full results?


Multiply the reported number by 10


I bet that this is mostly a random.randint benchmark and not SQLite.

Also interesting whether a batched numpy version would compare better to Rust.


Correction: random.randint + string interpolation


If it's a test database, can you simply generate it once, commit it, and use that in all future tests?


.. or the throughput of SQLite acting in pure fwrite(3) mode with a big buffer.


Is PRAGMA temp_store = MEMORY the same as putting the file on a TempFS?


I tried using tmpfs to create and exercise a file-based SQLite DB, and it performed significantly worse than a RAM-based DB (opened with an empty string or ':memory:').


Thank you :)


Maybe, maybe not. I imagine writing directly to memory will always be faster than writing to a filesystem, even if it’s ultimately backed by RAM. Might be a negligible difference though.


I've used the other pragma, but does

PRAGMA locking_mode = EXCLUSIVE;

improve performance much?


PRAGMA page_size = 65536;


But ensure this is actually in effect (e.g. do vacuum, etc.)


[ @dang, please delete if/when you get the chance, thank you ]


Did you mean to post this to https://news.ycombinator.com/item?id=27872596 ?


Yes. Not sure how I got mixed up.


>The larger DB required more than 30 minutes to generate. So I spent 30+ hours to reduce the 30 mins of running time :p

Worth it.

>I could just run s the script again.

Found a typo.


Cool


> Recently, I ran into a situation where I needed a test database with lots of rows and needed it fast. So I did what any programmer would do: wrote a Python script to generate the DB.

The author must have a rather low opinion of programmers... generating data with a script is good for 1,000 rows. Maybe 10,000 rows. 100,000 is probably stretching it. Beyond that the slowness will be meaningful.

Anyway, if the "test database" data is given in advance, you want an in-memory DBMS which can load data from files. If it isn't - use an in-memory DBMS which supports generated columns, or apply your random-number-generating function on a trivial column.

(MonetDB standalone/embedded would be a decent option; maybe the columnar storage mechanism in MariaDB? And of course there are commercial offerings.)

> Unfortunately, it was slow. Really slow. So I did what any programmer would do: went down the rabbit hole of learning more about SQLite...

That's the wrong approach. SQLite isn't called "lite" for nothing. It's not what you would use for bulk operations on large tables.

Read the SQLite FAQ: https://www.sqlite.org/whentouse.html

it oversells itself a bit, but still says its (only) advantages for data analysis are: "easier to install and use and the resulting database is a single file that can be written to a USB memory stick or emailed."


I suspect these opinions would be less likely to get downvoted if you could back them up with data, specifically:

- Assuming you do have a test database in advance that's represented in a file, but for some reason not a SQLite file or other database, how long does it take to load 100M rows in a database of your choice, and how does that compare to the author's results?

- The author does not in fact have a test database in advance, and is generating random data, which I think is what you're saying by "apply your random-number-generating function on a trivial column" - how long would your approach take to generate 100M rows?

- If you generated a database of about the same size in a database of your choice, how do bulk operations on large tables perform in comparison to SQLite? (My understanding is that it's "lite" because it's not client/server, not because its performance is poor.)

I don't think any of these experiments would be terribly hard to run.



