> wow! awesome tips. I knew COPY rocks but didn't realize it would win vs INSERT INTO SELECT FROM !
We (postgres) should fix that at some point... The difference is basically that there's a dedicated path for inserting many tuples at once that COPY often uses but INSERT INTO ... SELECT doesn't. The logic for determining when that optimization is correct (consider e.g. after-insert per-row triggers: the trigger invocation for row N may not yet see row N+1) is specific to COPY right now. We need to generalize it so it can be used in more places.
To be fair, part of the reason the COPY approach is faster is that the generate_series() query uses a fair bit of CPU on its own, and the piped psql invocations let the data generation and data loading run separately. Of course, that's partially paid for by having to serialize/deserialize the data and by handling all the data in four processes.
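For anyone who wants to see the two paths side by side, a minimal comparison could look like the sketch below (the table shape mirrors the benchmark used later in this thread; timings will obviously vary):
\timing on
CREATE UNLOGGED TABLE t (city int2 NOT NULL, temp int2 NOT NULL);
-- row-at-a-time insertion through the regular executor path
INSERT INTO t SELECT (1000*random())::int2, (random()*random()*1100)::int2 FROM generate_series(1, 1e7::int8);
-- dump and reload the same data; COPY FROM goes through the batched multi-insert path (heap_multi_insert)
COPY t TO '/tmp/t.pgcopy' WITH BINARY;
TRUNCATE t;
COPY t FROM '/tmp/t.pgcopy' WITH BINARY;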
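(The piped variant referred to above presumably looked roughly like this; the exact invocation isn't reproduced here. One psql plus its backend generates the data, another psql plus its backend loads it, i.e. four processes touching the data:)
psql -Xq -c "COPY (SELECT (1000*random())::int2 AS city, (random()*random()*1100)::int2 AS temp FROM generate_series(1,1e8::int8)) TO STDOUT WITH BINARY" \
  | psql -Xq -c "COPY temps_int2 FROM STDIN WITH BINARY"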
When doing the COPYs separately to/from a file, it actually takes longer to generate the data than loading the data into an unlogged table.
# COPY (SELECT (1000*random())::int2 as city, (random()*random()*1100)::int2 temp FROM (SELECT generate_series(1,1e8::int8))) TO '/tmp/data.pgcopy' WITH BINARY;
COPY 100000000
Time: 21560.956 ms (00:21.561)
# BEGIN;DROP TABLE IF EXISTS temps_int2; CREATE UNLOGGED TABLE temps_int2 (city int2 NOT NULL, temp int2 NOT NULL); COPY temps_int2 FROM '/tmp/data.pgcopy' WITH BINARY;COMMIT;
BEGIN
Time: 0.128 ms
DROP TABLE
Time: 0.752 ms
CREATE TABLE
Time: 0.609 ms
COPY 100000000
Time: 18874.010 ms (00:18.874)
COMMIT
Time: 229.650 ms
Loading into a logged table is a bit slower, at 20250.835 ms.
> Of course, if we really want to "go fast" then we want parallel loading, which means firing up N postgresql backends and each generate and write the data concurrently to different tables, then each compute a partial summary in N summary tables, and finally merge them together.
With PG >= 16, you need a fair bit of concurrency to hit bottlenecks due to multiple backends loading data into the same table with COPY. On my ~4 year old workstation I reach over 3GB/s, with a bit more work we can get higher. Before that the limit was a lot lower.
If I use large enough shared buffers so that IO does not become a bottleneck, I can load the 1e9 rows fairly quickly in parallel, using pgbench:
c=20; psql -Xq -c "COPY (SELECT (1000*random())::int2 as city, (random()*random()*1100)::int2 temp FROM (SELECT generate_series(1,1e6::int8))) TO '/tmp/data-1e6.pgcopy' WITH BINARY;" -c 'DROP TABLE IF EXISTS temps_int2; CREATE UNLOGGED TABLE temps_int2 (city int2 NOT NULL, temp int2 NOT NULL); ' && time pgbench -c$c -j$c -n -f <( echo "COPY temps_int2 FROM '/tmp/data-1e6.pgcopy' WITH BINARY;" ) -t $((1000/${c})) -P1
real 0m26.486s
That's just 1.2GB/s, because with rows this narrow the bottleneck is the per-row and per-field processing.
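(For reference, "large enough shared buffers" just means sized so the whole unlogged table fits in memory; 1e9 rows of this shape come to a few tens of GB of heap. The exact setting used isn't stated here, but it would be set along these lines, followed by a restart:)
ALTER SYSTEM SET shared_buffers = '64GB';  -- illustrative value only, not the one actually used
-- shared_buffers only takes effect after a server restart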
> The ultimate would be to hack into postgres and skip everything and just write the actual filesystem files in-place using knowledge of the file formats, then "wire in" these files to the database system tables. This normally gets hairy (e.g. TOAST) but with a simple table like this, it might be possible. This is a project I've always wanted to try.
I doubt that will ever be a good idea. For one, the row metadata contains transactional information that would be hard to create correctly outside of postgres. It'd also be too easy to cause issues with corrupted data.
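(Concretely, that transactional metadata lives in hidden system columns on every heap row; you can peek at it like this:)
-- xmin/xmax are the inserting/deleting transaction IDs, ctid is the tuple's physical location
SELECT ctid, xmin, xmax, city, temp FROM temps_int2 LIMIT 3;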
However, there's a lot we could do to speed up data loading further. The parsing that COPY does, uhm, shows signs of iterative development over decades. Absurdly enough, that's where the bottleneck most commonly is right now. I'm reasonably confident that at least a 3-4x gain is possible there without going to particularly extreme lengths. I think there's also at least a not-too-hard 2x for the portion that actually loads the data into the table.
thx! Indeed, after reading the source and sleeping on it, I agree. In fact, it looks like a single-user postgres backend running COPY FROM <filename> BINARY is approximately the same architecture as writing the database files in-place, and of course it includes support for default values, triggers, constraints, TOAST and more.
I've reproduced the speed difference between pg COPY and various other cases.
Some results (middle result of 3 stable runs) from 1.4GB BINARY dump:
echo "drop table if exists tbl; create table tbl(city int2, temp int2);copy tbl FROM '/citydata.bin' binary;" | ./pg/bin/postgres --single -D tmp -p 9999 postgres;
real 0m34.508s
# switching to unlogged table
real 0m30.620s
# hardcoding heap_multi_insert() to be a NOOP (return early if ntuples>100)
# fyi, heap_multi_insert() gets called with n=1000 tuples per call
real 0m11.276s
# hardcoding skip_tuple = true in src/backend/commands/copyfrom.c:1142
real 0m6.894s
# after testing various things
time sh -c "tar cf - citydata.bin | (cd /tmp; tar xf -)"
real 0m2.811s
Note: I tried increasing the blocksize (--with-blocksize) and also MAX_BUFFERED_TUPLES (copyfrom.c:65), but as expected they didn't help; I guess n=1000 tuples already amortizes the overhead.
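(If you want to repeat those two knobs, the rebuild is roughly as below; the exact values I tried aren't listed here, so treat them as examples.)
# --with-blocksize takes the block size in kB (default 8, max 32)
./configure --with-blocksize=32
# MAX_BUFFERED_TUPLES is the #define at copyfrom.c:65 mentioned above; edit it, then rebuild
make -j"$(nproc)" && make install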