Hacker News
Setting up PostgreSQL for running integration tests (gajus.com)
133 points by mooreds 8 months ago | hide | past | favorite | 46 comments



There's one idea I'm not sure is being described here.

Basically, you create a Docker container from some Postgres image.

Then you run your DDL scripts.

Then you stop this container and commit it as a new image.

Now you can create new containers from this new image and use them for tests. You can even parallelize tests by launching multiple containers. It should be fast enough thanks to Docker's overlay filesystem magic.

And it should work with any persistence solution, not just Postgres.


You can save some time and complexity by running a single container: first, once, set up your template database `my_template`, then create your testing databases using `CREATE DATABASE foo TEMPLATE my_template`. Basically TFA.

This will be much faster than restarting postgres a bunch of times, since this will just `cp` the database files on disk from the template to the new database.

The only "problem" is your application will need to switch to the new database name. You can also just put pgbouncer in front and let it solve that, if you want.


I think the idea is to get a clean DB for every test in your suite.

Your solution requires spinning up hundreds of Docker containers per test run.

We do the same thing as described with MS SQL; it takes about 1 sec to get a fresh DB that way.

Whereas starting MS SQL from a Docker image takes 30 seconds or so.


I would think the fundamental issue with this is similar to what the author described with template databases:

> However, on its own, template databases are not fast enough for our use case. The time it takes to create a new database from a template database is still too high for running thousands of tests:

And then the timing shows that this took about 2 seconds. Launching another container is surely going to be at least that slow, correct?

So it's clear the author is trying to get an "absolutely clean slate" for each of potentially many tests. That may not be what all teams need, but I will say we had an absolute beast of a time as we grew our test suite: as we parallelized it, we would get random test failures from tests stepping on each other's toes. So I really like the approach of starting with a totally clean template for each test.


Starting each test with a totally clean template ensures consistent reproducibility, I’ll give you that, but it also isn’t real world. You only have the data that the test seeds, which favors tests (and by extension business logic) written only with the “happy path” in mind.

I think the smell, when tests stepping on each other cause flaky runs, is that the logic being tested isn’t written for non-happy paths, or that the tests assert overly specific datasets or outcomes and anything else results in a failure. In the real world, other systems may very well be touching the same data or tables that your code is interacting with, so being able to handle that kind of situation produces a more fault-tolerant system overall, one that keeps delivering value even if other systems go haywire and produce unexpected data. Of course, the need to handle that extra complexity will vary depending on business needs and other factors.


This is absolutely the correct answer. Testing infra should be ephemeral and mostly stateless. Prior to Docker you had to figure out ways to mock the database or approximate it with a lightweight DB like H2 or SQLite.

With docker you can build out the test image with default usernames/passwords/etc...

Then as your install gets more complicated with stored procedures and the like you can add them to your test database and update local testing tooling and CI/CD to use that.

The massive benefit here is that you're using the exact same code to power your tests as you use to power your production systems. This eliminates issues caused by differences between prod and test environments, and anyone who's debugged those issues knows how long they can take, because figuring out that that's where the issue lies can take a really long time.


You have to pay a lot of startup latency for that, and Postgres has all the isolation mechanisms built in internally, so it’s easier to use.


I can +1 this approach and report a good amount of success with it using MySQL.

It really enabled end-to-end testing, as well as being able to stand up development instances quickly.


This is a great start, but I’d recommend three more improvements:

- Follow this guide to disable all durability settings. We don’t care if the DB can recover from a crash, since it’s only test data: https://www.postgresql.org/docs/current/non-durability.html (I wouldn’t worry about unlogged tables, personally)

- Set wal_level=minimal, which requires max_wal_senders=0. This reduces the amount of data written to [mem]disk in the first place.

- The big one: create a volume in /var/run/postgresql/ and share it with your application so that you can connect over the Unix domain socket rather than TCP. This is substantially faster, especially when you create new connections per test (or per thread).
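As a sketch, the durability settings from that guide plus the WAL settings above can be applied with `ALTER SYSTEM` (or the equivalent `postgresql.conf` lines); the exact set of values is an assumption, see the linked docs:

    ALTER SYSTEM SET fsync = off;
    ALTER SYSTEM SET synchronous_commit = off;
    ALTER SYSTEM SET full_page_writes = off;
    ALTER SYSTEM SET wal_level = minimal;
    ALTER SYSTEM SET max_wal_senders = 0;
    -- the wal_level / max_wal_senders changes require a server restart;
    -- the others take effect on a config reload
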


In Ruby on Rails there is a way to do multi-request integration tests against a single PostgreSQL/MariaDB database using transactions. The trick is to keep and share just one connection to the database.

With PostgreSQL, at the beginning of the test an outer transaction is opened, then the connection is shared with the application, test steps are performed (including Selenium etc.), requests open/close nested transactions, and so on. Once the steps are done, the test rolls back the outer transaction and the database is back to its initial state for the next test. It is very simple and handled by the framework.

In fact it can be even more sophisticated. E.g. an integration test class can preload some data (fixtures) for all the test cases of the class. For that, another nested transaction is used so the data load isn't repeated for every test case.

MariaDB doesn't have nested transactions; in that case RoR uses the SAVEPOINT mechanism. https://mariadb.com/kb/en/savepoint/
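The nesting can be sketched in plain SQL; this is roughly what the framework issues under the hood (savepoint names are illustrative):

    BEGIN;                       -- outer transaction, held for the whole test
    SAVEPOINT class_fixtures;    -- per-class fixture load, reused across cases
    -- ...test steps; the app's own transactions become nested savepoints...
    ROLLBACK;                    -- discard everything, DB back to initial state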


Interesting, we've just been looking at something similar at work. Unfortunately we've got a bunch of application config and test data in the database and no fast process for seeding it. We're now building a Postgres Docker image with a baked-in database. There were some tricks for making a small database and it's important to change `PGDATA` to something else, otherwise the `VOLUME` directive (in the official Postgres image) causes a full copy of all the files. It looks promising, though (accepting connections in ~200ms).


You could create a backup with the slow process, restore it, and then turn the restored database into a template.
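If you go that route, an existing database can be flagged as a template directly (sketch; `restored_db` is a placeholder name, and `IS_TEMPLATE` needs PostgreSQL 9.5+):

    ALTER DATABASE restored_db WITH IS_TEMPLATE true;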


> The other limitation of template databases to be aware of is that no other sessions can be connected to the source database while it is being copied. CREATE DATABASE will fail if any other connection exists when it starts; during the copy operation, new connections to the source database are prevented. It is an easy enough limitation to work around using a mutex pattern, but it is something to be aware of.

    -- Prevent new connections to template db
    update pg_database
    set datallowconn = false
    where datname = 'my_template_db';

    -- Close all current connections
    select pg_terminate_backend(pid) -- column was named "procpid" before PostgreSQL 9.2
    from pg_stat_activity
    where datname = 'my_template_db';
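To complete the mutex pattern, connections to the template can be re-enabled once the copy has finished (sketch; `ALLOW_CONNECTIONS` needs PostgreSQL 9.5+):

    -- After CREATE DATABASE ... TEMPLATE succeeds
    alter database my_template_db with allow_connections true;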


In .NET we have TestContainers and Respawn.

https://medium.com/@kova98/easy-test-database-reset-in-net-w...


Xunit offers several methods: https://xunit.net/docs/shared-context


I’ve been delighted with this approach and have never had a single glitch. Highly recommended, although I think I’ve heard from some denizens of HN that they’ve got even faster performance by manually copying data files instead of using template databases. That feels ickier, and the overhead I see is only 10-20ms per test so I have no great need to push further. YMMV!


Disclaimer: I work at Neon. Branching might be a good option here. https://neon.tech/docs/guides/branching-test-queries


I would only consider it if I can do it 100% locally; not sure if that’s the case. And I mean 100% locally, without an internet connection.


I am using TestContainers, which basically runs a Docker image of PostgreSQL, abstracting away a lot of details. You can find a working example of this setup for integration tests using Go, testify and PostgreSQL here: https://github.com/dherik/ddd-golang-project

For Java services using MySQL, I was often able to get by with just the H2 database (in-memory). It's very compatible with MySQL, and if you avoid database-specific features, this in-memory database can do a decent (and fast) job of running integration tests.


The whole point of the article is to make it _fast_: can you share how fast it is to spin up your database with TestContainers?


Running an "mvnd clean package" on a very simple Spring Boot app having 4 different spring integration tests takes exactly 8 seconds on my m1 max.

This includes a full build of the application, postgres container startup and spring context startup.

Starting the Postgres container seems to be less than 2 seconds of those 8 seconds.

Is that fast enough?


From the article:

  However, on its own, template databases are not fast enough for our use case. The time it takes to create a new database from a template database is still too high for running thousands of tests:

  postgres=# CREATE DATABASE foo TEMPLATE contra;
  CREATE DATABASE
  Time: 1999.758 ms (00:02.000)

So no. They say this adds ~33 minutes to running their tests.


You can use all the techniques from the article with the help of TestContainers. They help with setting up the DB and providing connection strings to the running containers.


I've never had a problem working https://pgtap.org/ into CI.

I know the article title says "integration tests" but when a lot of functionality is done inside PostgreSQL then you can cover a lot of the test pyramid with unit tests directly in the DB as well.

The test database orchestration from the article pairs really well with pgTAP for isolation.


The article mentions copying databases using templates. Once your database gets big enough, that might not make sense.

You could use a ZFS filesystem for your cluster, snapshot it, then mount the snapshot in a new cluster. This avoids copying the data, since ZFS does copy-on-write. So you get your isolation, plus speed on large DBs. You do need to run a new Postgres cluster, though.


We do the same thing with Microsoft SQL.

Each time the hash of our migrations changes, we make a new template DB. That takes about 30 seconds to run all our migrations.

If the hash didn't change, clone the template of the given hash. This takes less than 1 second.

Some advantages not mentioned in article:

- When developing locally, a new DB clone is made to run my single test interactively. If this took much more than a second I would get rather annoyed.

- Yes, a test suite can share one DB for the full suite if all tests are written without hardcoding anything, always provisioning new IDs, etc. But it is super useful to run a function like QueryDump("select * from MyTable") in the test of the code I am working on, with the full table dump containing only what that single test produced. During debugging this speeds things up a lot vs. querying for whatever random ID my test allocated.


I work on PGlite [0]; integration testing is absolutely something we want it to be useful for. As an early data point, the Drizzle integration tests run in about 1/4 the time with PGlite in-memory compared to a full Postgres.

The OP mentions lack of extensions as a blocker for using it now; they are coming soon. Additionally, it's currently limited to a single connection, and again I hope we can remove this limitation.

I also have a few other ideas around how to make super fast testing possible.

http://github.com/electric-sql/pglite


This is intriguing but I'm curious how you handle, or plan to handle, different isolation levels? In the past I've written tests to assert serializable guarantees, for example.
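For reference, a sketch of the kind of statement such a test relies on (this is standard Postgres behavior, not anything PGlite-specific; the expected error is the usual serialization failure):

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- ...conflicting read/write from a concurrent session here...
    COMMIT;  -- one of the sessions should fail with SQLSTATE 40001
             -- (serialization_failure), which the test asserts on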


Have you tried pg-mem for integration testing? https://github.com/oguimbal/pg-mem


I independently arrived at a very similar design, but also maintain a hot pool of template instances to hide the latency of spinning one up. The state of the template databases is maintained in another database on the same pg instance.

Another thing to contemplate is utilizing the multi tenancy systems you have in place for isolation. It can be a good way to test that the tenant isolation actually works.


Wish we could do the same with MSSQL, which we're stuck with =| Restoring a base from a snapshot takes no less than 4-5 seconds. There's no way to run it on an in-memory filesystem without a big pile of crutches... We tried to do so with TestContainers, only to find out that we are limited by disk IO.


We just copy a schema, run tests on the copy, and drop it afterwards.

It can run in parallel if you name schemas randomly, or by test name.

You only need one DB; no need to even re-connect, start a DB cluster, or whatever. It can all work with a single persistent connection.

And it's much less resource intensive.


You could probably use tmpfs to store the database, since that defaults to memory (at first, at least). I wonder if that is measurably different from mounting a memory-based fs.


I tried this and didn't find any improvements, then I realized we had `PG_DATA: /dev/shm/pgdata/data` which I guess has the same effect


do you create /dev/shm/pgdata/data before using it?


Why can't you just create a pool of databases and truncate all tables after each run? That should be pretty fast.


DELETE is faster on a small amount of data.

But yeah, DELETE-ing is faster than creating a DB from a template (depending on how much you have to delete, of course). However, templates allow parallelism by giving each test its own database. I ended up doing a mix of both: create a DB if there isn't one available, or reuse one if available, always tearing down data.


TRUNCATE is faster than DELETE. You could have 100 dbs, each test first acquires one, runs, truncates, releases. No need to create more dbs on the fly.
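A sketch of that per-test cleanup (table names are hypothetical):

    -- Faster than DELETE for full-table cleanup: no per-row work,
    -- and sequences are reset too
    TRUNCATE TABLE users, orders, events RESTART IDENTITY CASCADE;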


Wait, how do you create a "memory disk"? That seems like an important step.


They use the `--tmpfs` flag when calling `docker run`. More info: https://docs.docker.com/storage/tmpfs/

Pretty neat, I didn't know about it before.


Dang! I didn't know that either. Kinda subtle. Blink and you'll miss it.


Wow. So much effort reinventing the wheel. The transaction approach is obviously not suited for integration testing, as commits/rollbacks are part of the package too.

There is testcontainers + flyway/liquibase. Problem solved.


Testcontainers are anything but fast in my experience. With the image pre-pulled, the most basic mysql testcontainer takes about 5 seconds from calling RunContainer to it being ready to go.


I use TestContainers in my Java Spring Boot application coupled with Liquibase. I have it set up so that the PG instance is created once, with Liquibase taking it from vanilla PG to a fully created, empty application database; then all the integration tests reuse the same instance by overriding the stop() method with an empty one.

Running the full suite of tests is about 5-10 mins on a relatively low-powered build server, whereas on my more robust development workstation it's about 3 mins. Getting a TestContainer running an empty DB for integration tests ends up being a very small portion of the time.

Most of the time any work being performed is done in a transaction and rolled back after the test is completed, though there are some instances where I'm testing code that requires multiple transactions with committed data needing to be available. The one ugly area here is that in some of these outlier test cases I need to add code to manually clear out that data.


You're describing a pretty different testing strategy, where you only use one database and ensure isolation through transactions. It's a good strategy for your use-case, but it's not what the article talks about (they do mention this transactional approach and why it doesn't work for them)


The article is discussing the use case where each test/set of assertions gets a clean DB, i.e. hundreds of DBs per test run.

Are you talking about a single DB reused for each test? That is of course no problem...

Our suite takes 3 to 10 minutes to run too. It provisions about a hundred databases (each DB template clone takes 1 sec; the wall clock reflects running things in parallel).

Also, every time we change code and run a related test locally, we make a new DB to run the test against. If that took more than 1 second I would go crazy.



