Basically you create a Docker container from some Postgres image.
Then you run your DDL scripts.
Then you stop this container and commit it as a new image.
Now you can create new containers from this image and use them for tests. You can even parallelize tests by launching multiple containers. It should be fast enough thanks to Docker overlay magic.
And it should work with any persistence solution, not just Postgres.
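If it helps, here's a rough sketch of that workflow driven from Go through the docker CLI (image names, the password and the sample DDL are made up). One gotcha: the official postgres image declares a VOLUME for its data directory and `docker commit` doesn't capture volume contents, so you have to move PGDATA elsewhere for the committed image to actually contain your schema.

package main

import (
    "fmt"
    "os/exec"
    "time"
)

// run shells out to the docker CLI and surfaces its output on failure.
func run(args ...string) {
    out, err := exec.Command(args[0], args[1:]...).CombinedOutput()
    if err != nil {
        panic(fmt.Sprintf("%v: %s", err, out))
    }
}

func main() {
    // 1. Throwaway container; PGDATA moved out of the image's declared VOLUME.
    run("docker", "run", "-d", "--name", "pg-seed",
        "-e", "POSTGRES_PASSWORD=test",
        "-e", "PGDATA=/pgdata",
        "postgres:16")
    time.Sleep(5 * time.Second) // crude; poll `docker exec pg-seed pg_isready` in real code

    // 2. Run the DDL scripts (a single illustrative statement here).
    run("docker", "exec", "pg-seed", "psql", "-U", "postgres",
        "-c", "CREATE TABLE users (id serial PRIMARY KEY, name text)")

    // 3. Stop and commit as the image your tests will start containers from.
    run("docker", "stop", "pg-seed")
    run("docker", "commit", "pg-seed", "pg-seeded:latest")
}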
You can save some time and complexity and just run a single container: set up your template database `my_template` once, then create your testing databases using `CREATE DATABASE foo TEMPLATE my_template`. Basically TFA.
This will be much faster than restarting Postgres a bunch of times, since Postgres just `cp`s the database files on disk from the template to the new database.
The only "problem" is your application will need to switch to the new database name. You can also just put pgbouncer in front and let it solve that, if you want.
I would think the fundamental issue with this is similar to what the author described with template databases:
> However, on its own, template databases are not fast enough for our use case. The time it takes to create a new database from a template database is still too high for running thousands of tests:
And the timing they show indicates this took about 2 seconds. Launching another container is surely going to be at least that slow, correct?
So it's clear the author is trying to get an "absolutely clean slate" for each of potentially many tests. That may not be what all teams need, but I will say we had an absolute beast of a time as we grew our test suite: as we parallelized it, we would get random test failures from tests stepping on each other's toes. So I really like the approach of starting with a totally clean template for each test.
Starting each test with a totally clean template ensures consistent reproducibility, I’ll give you that, but it also isn’t real world, either. You only have the data that the test seeds, which favors tests (and by extension business logic) written only with the “happy path” in mind. I think the smell when tests are stepping on each other and causing flaky runs is that the logic being tested isn’t written for non-happy paths, or that the tests assert datasets or outcomes that are entirely too specific, so anything else results in a failure. In the real world, other systems may very well be touching the same data or tables that your code is interacting with, so being able to handle that kind of situation will produce a more fault-tolerant system overall, which will keep delivering value even if other systems go haywire and produce unexpected data alongside what you are looking at. Of course, the need to handle that extra complexity is going to vary depending on business needs or other determining factors.
This is absolutely the correct answer. Testing infra should be ephemeral and mostly stateless. Prior to Docker you had to figure out ways to mock the database or approximate it with a lightweight DB like H2 or SQLite.
With docker you can build out the test image with default usernames/passwords/etc...
Then as your install gets more complicated with stored procedures and the like you can add them to your test database and update local testing tooling and CI/CD to use that.
The massive benefit here is that you're using the exact same code to power your tests as you use to power your production systems. This eliminates issues caused by differences between prod and test environments, and anyone who's debugged those issues knows how long they can take, because it can take a really long time to figure out that that's where the issue lies.
This is a great start, but I’d recommend three more improvements:
- Follow this guide to disable all durability settings. We don’t care if the DB can recover from a crash, since it’s only test data: https://www.postgresql.org/docs/current/non-durability.html (I wouldn’t worry about unlogged tables, personally)
- Set wal_level=minimal, which requires max_wal_senders=0. This reduces the amount of data written to [mem]disk in the first place. (Both of these are sketched below.)
- The big one: create a volume in /var/run/postgresql/ and share it with your application so that you can connect over the Unix domain socket rather than TCP. This is substantially faster, especially when you create new connections per test (or per thread).
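For the first two points, a sketch with testcontainers-go might look like this (image tag, password and the exact flag list are assumptions; the Unix-socket bind mount from the third point is left out):

package testpg

import (
    "context"

    "github.com/testcontainers/testcontainers-go"
    "github.com/testcontainers/testcontainers-go/wait"
)

// StartFastPostgres starts a Postgres container with crash safety disabled
// and WAL kept to a minimum, which is fine for throwaway test data.
func StartFastPostgres(ctx context.Context) (testcontainers.Container, error) {
    req := testcontainers.ContainerRequest{
        Image:        "postgres:16",
        Env:          map[string]string{"POSTGRES_PASSWORD": "test"},
        ExposedPorts: []string{"5432/tcp"},
        Cmd: []string{
            "-c", "fsync=off",
            "-c", "synchronous_commit=off",
            "-c", "full_page_writes=off",
            "-c", "wal_level=minimal",
            "-c", "max_wal_senders=0",
        },
        // Simplistic readiness check; a log-based wait is more robust.
        WaitingFor: wait.ForListeningPort("5432/tcp"),
    }
    return testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
        ContainerRequest: req,
        Started:          true,
    })
}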
In Ruby on Rails there is a way to do multi-request integration tests with a single PostgreSQL/MariaDB database, using transactions. The trick is to keep and share just one connection to the database.
With PostgreSQL, at the beginning of the test an outer transaction is opened, then the connection is shared with the application, test steps are performed (including Selenium etc.), requests open/close nested transactions, and so on. Once the steps are done, the test rolls back the outer transaction and the database is back to its initial state for the next test. It is very simple and handled by the framework.
In fact it can be even more sophisticated. E.g. an integration test class can preload some data (fixtures) for all the test cases of the class. For that, another nested transaction is used so the data load process doesn't have to be repeated for every test case.
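Rails handles all of that for you; for anyone without such a framework, a rough Go analogue of the core trick (one connection, an outer transaction that is always rolled back) could be as small as this, with nested application transactions mapped onto savepoints:

package txtest

import (
    "database/sql"
    "testing"
)

// WithRollback runs fn inside a transaction that is never committed, so the
// database is back to its initial state for the next test.
func WithRollback(t *testing.T, db *sql.DB, fn func(tx *sql.Tx)) {
    t.Helper()
    tx, err := db.Begin()
    if err != nil {
        t.Fatal(err)
    }
    defer tx.Rollback()
    fn(tx)
    // A full implementation would intercept the application's own BEGIN/COMMIT
    // calls and turn them into SAVEPOINT / RELEASE / ROLLBACK TO SAVEPOINT.
}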
Interesting, we've just been looking at something similar at work. Unfortunately we've got a bunch of application config and test data in the database and no fast process for seeding it. We're now building a Postgres Docker image with a baked-in database. There were some tricks for making a small database and it's important to change `PGDATA` to something else, otherwise the `VOLUME` directive (in the official Postgres image) causes a full copy of all the files. It looks promising, though (accepting connections in ~200ms).
> The other limitation of template databases to be aware of is that no other sessions can be connected to the source database while it is being copied. CREATE DATABASE will fail if any other connection exists when it starts; during the copy operation, new connections to the source database are prevented. It is an easy enough limitation to work around using a mutex pattern, but it is something to be aware of.
-- Prevent new connections to template db
update pg_database
set datallowconn = false
where datname = 'my_template_db';
-- Close all current connections
select pg_terminate_backend(pid)
from pg_stat_activity
where datname = 'my_template_db';
I’ve been delighted with this approach and have never had a single glitch. Highly recommended, although I think I’ve heard from some denizens of HN that they’ve got even faster performance by manually copying data files instead of using template databases. That feels ickier, and the overhead I see is only 10-20ms per test so I have no great need to push further. YMMV!
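For reference, the whole dance wrapped up in Go (database names are illustrative; this needs a superuser connection to a database other than the template, and you should serialise calls if tests clone concurrently):

package testdb

import (
    "database/sql"
    "fmt"
)

func cloneTemplate(admin *sql.DB, newDB string) error {
    steps := []string{
        // Prevent new connections to the template (superuser required).
        `UPDATE pg_database SET datallowconn = false WHERE datname = 'my_template_db'`,
        // Kick out any existing sessions (the column is "pid" on modern Postgres).
        `SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'my_template_db'`,
        // The actual clone: a file-level copy of the template. Only trusted names here.
        fmt.Sprintf(`CREATE DATABASE %q TEMPLATE my_template_db`, newDB),
        // Allow connections again so the template can be updated later.
        `UPDATE pg_database SET datallowconn = true WHERE datname = 'my_template_db'`,
    }
    for _, s := range steps {
        if _, err := admin.Exec(s); err != nil {
            return fmt.Errorf("%s: %w", s, err)
        }
    }
    return nil
}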
I am using TestContainers, which basically is able to run a Docker image of PostgreSQL, abstracting a lot of details. You can find a working example of this setup for integration tests using Go, testify and PostgreSQL here: https://github.com/dherik/ddd-golang-project
For Java services using MySQL, I was able to use just the H2 database (in-memory) many times. It does a decent job and it's very compatible with MySQL. If you avoid database-specific features, this in-memory database can do a decent (and fast) job running integration tests.
> However, on its own, template databases are not fast enough for our use case. The time it takes to create a new database from a template database is still too high for running thousands of tests:
postgres=# CREATE DATABASE foo TEMPLATE contra;
CREATE DATABASE
Time: 1999.758 ms (00:02.000)
So no. They say this adds ~33 minutes to running their tests.
You can utilize all the techniques from the article with the help of TestContainers.
They help with setting up the DB and providing connection strings to the running containers.
I know the article title says "integration tests", but when a lot of functionality is done inside PostgreSQL, you can cover a lot of the test pyramid with unit tests directly in the DB as well.
The test database orchestration from the article pairs really well with pgTAP for isolation.
The article mentions copying databases using templates. Once your database gets big enough, that might not make sense.
You could use a ZFS filesystem for your cluster, snapshot it, then mount the snapshot in a new cluster. This avoids copying the data, since ZFS does copy-on-write. So you get your isolation, and speed on large DBs. You do need to run a new Postgres cluster, though.
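Something like this, roughly (dataset names, mountpoint and port are made up; in practice you'd snapshot with the template cluster shut down, fix ownership/permissions on the clone and remove any stale postmaster.pid before starting):

package zfstest

import (
    "fmt"
    "os/exec"
)

// cloneCluster snapshots the template cluster's dataset, clones it
// copy-on-write, and starts a second Postgres on the clone.
func cloneCluster() error {
    cmds := [][]string{
        // One-time: snapshot the data directory of the template cluster.
        {"zfs", "snapshot", "tank/pgdata@template"},
        // Per test run: a copy-on-write clone, no data copied.
        {"zfs", "clone", "tank/pgdata@template", "tank/pgdata_test1"},
        // Start another cluster on the clone, on its own port.
        {"pg_ctl", "-D", "/tank/pgdata_test1", "-o", "-p 5433", "start"},
    }
    for _, c := range cmds {
        if out, err := exec.Command(c[0], c[1:]...).CombinedOutput(); err != nil {
            return fmt.Errorf("%v: %s", err, out)
        }
    }
    return nil
}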
Each time the hash of our migrations changes, we make a new template DB. That takes about 30 seconds to run all our migrations.
If the hash didn't change, we clone the template for the given hash. This takes less than 1 second.
Some advantages not mentioned in the article:
- When developing locally, a new DB clone is made to run my single test interactively. If this took much more than a second I would get rather annoyed.
- Yes, a test suite can share a DB for the full suite if all tests are written without hardcoding anything, always provision new IDs, etc. But it is super useful to be able to run a function like QueryDump("select * from MyTable") in the test of the code I am working on, with the full table dump containing only what that single test worked on. That speeds up debugging a lot vs. querying for whatever random ID my test allocated.
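In case it's useful, the caching scheme boils down to something like this (helper names are invented; the migration runner itself is omitted):

package testdb

import (
    "crypto/sha256"
    "database/sql"
    "fmt"
    "os"
    "path/filepath"
    "sort"
)

// migrationsHash hashes the migration files so the template can be keyed on it.
func migrationsHash(dir string) (string, error) {
    h := sha256.New()
    files, err := filepath.Glob(filepath.Join(dir, "*.sql"))
    if err != nil {
        return "", err
    }
    sort.Strings(files)
    for _, f := range files {
        b, err := os.ReadFile(f)
        if err != nil {
            return "", err
        }
        h.Write(b)
    }
    return fmt.Sprintf("%x", h.Sum(nil))[:12], nil
}

// templateFor returns the name of the template DB for the current migrations,
// building it once (~30s) if it doesn't exist yet.
func templateFor(admin *sql.DB, dir string) (string, error) {
    hash, err := migrationsHash(dir)
    if err != nil {
        return "", err
    }
    name := "tmpl_" + hash
    var exists bool
    if err := admin.QueryRow(
        `SELECT EXISTS (SELECT 1 FROM pg_database WHERE datname = $1)`, name,
    ).Scan(&exists); err != nil {
        return "", err
    }
    if !exists {
        if _, err := admin.Exec(fmt.Sprintf(`CREATE DATABASE %q`, name)); err != nil {
            return "", err
        }
        // runMigrations(name) would go here.
    }
    // <1s path: callers clone it with CREATE DATABASE ... TEMPLATE.
    return name, nil
}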
I work on PGlite [0], and integration testing is absolutely something we want it to be useful for. For an early data point, the Drizzle integration tests run in about 1/4 the time with PGlite in-memory than with a full Postgres.
The OP mentions lack of extensions being a blocker to using it now; they are coming soon. Additionally it's currently limited to a single connection, and again I hope we can remove this limitation.
I also have a few other ideas around how to make super fast testing possible.
This is intriguing but I'm curious how you handle, or plan to handle, different isolation levels? In the past I've written tests to assert serializable guarantees, for example.
I independently arrived at a very similar design, but also maintain a hot pool of template instances to hide the latency of spinning one up. The state of the template databases is maintained in another database on the same pg instance.
Another thing to contemplate is utilizing the multi tenancy systems you have in place for isolation. It can be a good way to test that the tenant isolation actually works.
Wish we could do the same with MSSQL, which we're stuck with =|
Restoring a database from a snapshot takes no less than 4-5 seconds.
No way to run it on an in-memory filesystem without a big pile of crutches...
Tried to do so with TestContainers, only to find out that we are limited by disk IO.
Could probably use tmpfs to store the database, as that defaults to memory (at first at least). I wonder if that is measurably different from mounting a memory-based fs.
But yeah, DELETEing is faster than creating a DB from a template (depends on how much you have to delete, of course). However, templates allow parallelism by giving each test its own database. I ended up doing a mix of both: create a DB if there's not one available, or reuse one if available, always tearing down the data.
TRUNCATE is faster than DELETE. You could have 100 dbs, each test first acquires one, runs, truncates, releases. No need to create more dbs on the fly.
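A sketch of that pool in Go (size, names and the template are arbitrary): tests check a database out of a channel, run, truncate their tables and check it back in.

package dbpool

import (
    "database/sql"
    "fmt"
)

type Pool struct {
    admin *sql.DB
    names chan string
}

// New creates size databases up front from a template and fills the pool.
func New(admin *sql.DB, size int) (*Pool, error) {
    p := &Pool{admin: admin, names: make(chan string, size)}
    for i := 0; i < size; i++ {
        name := fmt.Sprintf("test_db_%d", i)
        if _, err := admin.Exec(fmt.Sprintf(`CREATE DATABASE %q TEMPLATE my_template`, name)); err != nil {
            return nil, err
        }
        p.names <- name
    }
    return p, nil
}

// Acquire blocks until a database is free.
func (p *Pool) Acquire() string { return <-p.names }

// Release truncates the test's tables and puts the database back in the pool.
func (p *Pool) Release(db *sql.DB, name string, tables ...string) error {
    for _, t := range tables {
        // TRUNCATE is faster than DELETE and resets the table in one shot.
        if _, err := db.Exec(fmt.Sprintf(`TRUNCATE TABLE %q RESTART IDENTITY CASCADE`, t)); err != nil {
            return err
        }
    }
    p.names <- name
    return nil
}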
Wow. So much effort in reinventing the wheel. The transaction approach is obviously not suited for integration testing as commits/rollbacks are part of the package too.
There is testcontainers + flyway/liquibase. Problem solved.
Testcontainers are anything but fast in my experience. With the image pre-pulled, the most basic mysql testcontainer takes about 5 seconds from calling RunContainer to it being ready to go.
I use TestContainers in my Java Spring Boot application coupled with Liquibase. I have it set up so that the PG instance is created once, with Liquibase initializing it from vanilla PG to a fully created, empty application database; then all the integration tests reuse the same instance by overriding the stop() method with an empty one.
Running the full suite of tests is about 5-10 mins on a relatively low-powered build server, whereas on my more robust development workstation it's about 3 mins. Getting a TestContainer running an empty DB for integration tests ends up being a very small portion of the time.
Most of the time any work being performed is done in a transaction and rolled back after the test is completed, though there are some instances where I'm testing code that requires multiple transactions with committed data needing to be available. The one ugly area here is that in some of these outlier test cases I need to add code to manually clear out that data.
You're describing a pretty different testing strategy, where you only use one database and ensure isolation through transactions. It's a good strategy for your use-case, but it's not what the article talks about (they do mention this transactional approach and why it doesn't work for them)
The article is discussing the use case where each test/set of assertions gets a clean DB. I.e. hundreds of DBs per test run.
Are you talking about a single DB reused for each test? That is of course no problem...
Our suite takes 3 to 10 minutes to run too. It provisions about a hundred databases (each DB template clone takes about 1 sec; the wall clock is lower because things run in parallel).
Also, every time we change code and run a related test locally we make a new DB to run its test. If that took more than 1 second I would go crazy.