Hacker News

This looks very nice indeed but I see a few possible problems which I have seen with pg_repack which might apply to this approach as well:

You can't rename a table without taking a lock. How exactly do you switch the original table out for a view pointing to the new table? The docs don't go into detail on how this is done; I'll check the code later.

It looks like the tool maintains two copies of the table, but how exactly this copying is done isn't explained. A potential issue is that you need spare disk space and I/O capacity available to support it.

The copy-table-plus-trigger approach might not work for databases of significant size. For example, I have seen instances doing >50k qps against a table where it is not possible to run pg_repack: it never catches up, and it never manages to take the lock needed to switch over to the new table. This can also be simulated with overlapping long-running queries.




> This looks very nice indeed but I see a few possible problems which I have seen with pg_repack which might apply to this approach as well:

Thank you for your input! I'm one of the pgroll authors :)

> You can't rename a table without taking a lock. How exactly do you switch the original table out for a view pointing to the new table? The docs don't go into detail on how this is done; I'll check the code later.

pgroll only performs operations requiring a short lock, like renaming a table. It sets a lock timeout for these operations (500ms by default) to avoid lock contention when other operations are taking place. We plan to add an automatic retry mechanism for these timeouts so that no manual intervention is needed.
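The short-lock pattern can be sketched in plain SQL (table names are illustrative, not pgroll's actual statements):

```sql
-- Fail fast instead of queueing behind long-running queries:
-- if the ACCESS EXCLUSIVE lock isn't acquired within 500ms,
-- the statement errors out and can simply be retried later.
SET lock_timeout = '500ms';
ALTER TABLE users RENAME TO users_old;
-- On contention, Postgres reports:
--   ERROR: canceling statement due to lock timeout
```

The key point is that the rename itself is cheap once the lock is held; the timeout only bounds how long we are willing to wait in the lock queue.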

One cool thing about views is that they will automatically get updated when you rename a table/column, so the view keeps working after the rename.
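This is easy to verify in psql (names here are made up for the example):

```sql
CREATE TABLE items (id int, name text);
CREATE VIEW items_v AS SELECT id, name FROM items;

ALTER TABLE items RENAME TO items_renamed;

-- SELECT * FROM items_v still works: views bind to the
-- table's OID rather than its name, so the stored view
-- definition now shows "FROM items_renamed".
SELECT * FROM items_v;
```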

> It looks like the tool maintains two copies of the table but how exactly this copy process is done isn't explained. A potential issue is that you need to have disk space and I/O capacity available to support this.

> The copy table + trigger approach might not work for databases of significant size. For example I have seen instances with >50k qps on a table where it is not possible to run pg_repack because it never catches up and it also doesn't ever manage to take the lock which is needed to switch to the new table. This can be simulated with overlapping long running queries.

pgroll doesn't copy full tables, only individual columns when needed (for instance, when a constraint changes). It is true that I/O can become an issue: backfilling is batched, but the system still needs enough spare capacity for it to happen. There are some opportunities to monitor I/O and throttle backfilling based on it.
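A minimal sketch of batched backfilling (illustrative only; the column and batch size are assumptions, not pgroll's implementation):

```sql
-- Backfill the new column in small batches to bound I/O
-- and per-statement lock time. Repeat until 0 rows are
-- updated; a trigger keeps concurrent writes in sync.
UPDATE users
SET email_normalized = lower(email)
WHERE id IN (
  SELECT id FROM users
  WHERE email_normalized IS NULL
  LIMIT 1000
);
```

Batching like this is what makes throttling feasible: the driver can sleep between batches, or shrink the batch size, when I/O pressure rises.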



