> Anyone know how well this works with very large datasets? The backfill sounds like it would take a while to do.
Yes, it can take a long time. It's similar in that regard to, for example, gh-ost for MySQL, which also does backfills. The advantage with Postgres is that backfills are required for fewer migration types, and pgroll only runs them when needed.
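To give a feel for why large tables take a while: a backfill is essentially a batched copy into a duplicated column. Here's a minimal sketch of that pattern in Go (not pgroll's actual code; the table, column, and connection string are made up):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

// backfill copies values into the new column in small batches so the
// table is never locked for long. A sketch of the general technique,
// not pgroll's implementation; names are hypothetical.
func backfill(db *sql.DB) error {
	const batchSize = 1000
	for {
		res, err := db.Exec(`
			UPDATE users
			SET email_new = lower(email)
			WHERE id IN (
				SELECT id FROM users
				WHERE email_new IS NULL
				LIMIT $1
				FOR UPDATE SKIP LOCKED
			)`, batchSize)
		if err != nil {
			return err
		}
		n, _ := res.RowsAffected()
		if n == 0 {
			return nil // every row has been backfilled
		}
		fmt.Printf("backfilled %d rows\n", n)
	}
}

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if err := backfill(db); err != nil {
		log.Fatal(err)
	}
}
```

The batch size bounds lock duration per statement, which is why a backfill over a large table runs long but stays safe.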
> Does this Go binary need to be continuously running, or does it keep track of migration state in the database?
The latter; you only run the Go binary when doing schema changes.
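Since the state lives in the database itself, you can inspect it with a plain query between runs of the binary. A sketch (the pgroll schema, table, and column names here are my assumptions; verify against the docs before relying on them):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

// List migrations recorded in pgroll's internal state schema.
// The "pgroll.migrations" table and its columns are assumptions for
// illustration, not a confirmed part of pgroll's public interface.
func main() {
	db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query(`SELECT name, done FROM pgroll.migrations ORDER BY created_at`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var name string
		var done bool
		if err := rows.Scan(&name, &done); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s done=%v\n", name, done)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```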
How can you properly plan for, e.g., disk storage requirements? Does the tool calculate that upfront via some sort of dry-run mode? For companies with larger datasets this would be a rather important consideration. Also, those backfills will generate a lot of network traffic in clustered environments.
This is a good point. I believe we can look into estimating storage needs or timings before a migration; it definitely looks like a nice-to-have.
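In the meantime, a crude upfront estimate is possible by hand: for a column backfill, the extra heap space is roughly the current on-disk size of the affected column. A sketch (hypothetical table and column; ignores WAL, index, and replication overhead):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

// Rough estimate of the extra disk a column backfill might need
// (a sketch, not a pgroll feature): the duplicated column's current
// data volume approximates the extra heap space required.
func main() {
	db, err := sql.Open("postgres", "postgres://localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var columnBytes, tableBytes int64
	if err := db.QueryRow(
		`SELECT COALESCE(SUM(pg_column_size(email)), 0) FROM users`,
	).Scan(&columnBytes); err != nil {
		log.Fatal(err)
	}
	if err := db.QueryRow(
		`SELECT pg_total_relation_size('users')`,
	).Scan(&tableBytes); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("column data: %d bytes of %d total; budget at least that much extra\n",
		columnBytes, tableBytes)
}
```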