Hi there, I'm one of the pgroll authors :) I could be mistaken here, but I belie...

gvkhna · 2023-10-04T02:37:29.000000Z

Thanks for the reply. A write-up on the pros and cons of these approaches would be fantastic. I have no clue which is better, but I believe pg-osc is heavily inspired by github/gh-ost, GitHub's tool for online schema changes for MySQL.

brycethornton · 2023-10-03T16:17:18.000000Z

Does pgroll have any process to address table bloat after the migration? One of the (many) nice things about pg-osc is that it results in a fresh new table without bloat.

surjection · 2023-10-03T17:03:20.000000Z

Another pgroll author here :)

I'm not very familiar with pg-osc, but migrations with pgroll are a two-phase process: an 'in progress' phase, during which both old and new versions of the schema are accessible to client applications, and a 'complete' phase, after which only the latest version of the schema is available.

To support the 'in progress' phase, some migrations (such as adding a constraint) require creating a new column and backfilling data into it. Triggers are also created to keep both old and new columns in sync. So during this phase there is 'bloat' in the table in the sense that this extra column and the triggers are present.

Once completed, however, the old version of this column is dropped from the table along with any triggers, so there is no bloat left behind after the migration is done.

brycethornton · 2023-10-03T17:21:41.000000Z

Thanks for the reply. My question was specifically about the MVCC behavior that creates new row versions for updates like this. If you're backfilling data into a new column then you'll likely end up creating new row versions for the entire table, and the space for the old ones will be marked for re-use via auto-vacuuming. Anyway, bloat like this is a big pain for me when making migrations on huge tables. It doesn't sound like this type of bloat cleanup is a goal for pgroll. Regardless, it's always great to have more options in this space. Thanks for your work!
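
To make the MVCC point concrete, here is a minimal sketch in Python. It is a toy model, not Postgres internals: `ToyHeap`, its slot states, and the `backfilled` field are all hypothetical names invented for illustration. It only shows the general shape of the problem: a backfill that updates every row leaves one dead version per row, and vacuuming frees that space for re-use without shrinking the table.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyHeap:
    # Each slot is (state, row), where state is "live", "dead", or "free".
    # Hypothetical structure; real Postgres pages are far more involved.
    slots: list = field(default_factory=list)

    def insert(self, row):
        # Reuse a free slot if one exists, otherwise grow the heap.
        for i, (state, _) in enumerate(self.slots):
            if state == "free":
                self.slots[i] = ("live", row)
                return
        self.slots.append(("live", row))

    def update_all(self, fn):
        # An update writes a new row version and leaves the old one dead.
        live = [i for i, (s, _) in enumerate(self.slots) if s == "live"]
        for i in live:
            _, row = self.slots[i]
            self.slots[i] = ("dead", row)
            self.insert(fn(row))

    def vacuum(self):
        # Dead versions become reusable space; the heap does not shrink.
        self.slots = [("free", None) if s == "dead" else (s, r)
                      for s, r in self.slots]

    def counts(self):
        return Counter(state for state, _ in self.slots)


heap = ToyHeap()
for i in range(100):
    heap.insert({"id": i})

# Backfill a new column on every row.
heap.update_all(lambda row: {**row, "backfilled": True})
size_after_backfill = len(heap.slots)
counts_after_backfill = heap.counts()

heap.vacuum()
size_after_vacuum = len(heap.slots)
counts_after_vacuum = heap.counts()
```

In this model the heap doubles during the backfill and stays that size after the vacuum; the freed slots only help later writes. Getting back to a compact table needs a full rewrite, which is what the "fresh new table" approach gives you.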

pritambaral · 2023-10-03T17:18:42.000000Z

> ... so there is no bloat left behind after the migration is done.

This is only true once all rows have been rewritten after the old column is dropped. In standard, unmodified Postgres, DROP COLUMN does not rewrite existing tuples; it only hides the column, so the space it occupied is not reclaimed immediately.
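
A rough way to picture this, as a toy Python sketch rather than anything resembling Postgres's actual storage code: `ToyTable`, `physical_size`, and the column names are hypothetical, made up for illustration. Dropping a column here only changes which columns are visible; the stored tuples keep the old values until something rewrites them.

```python
class ToyTable:
    """Toy model: a list of visible columns plus physically stored tuples."""

    def __init__(self, columns):
        self.visible = list(columns)  # stands in for the catalog
        self.tuples = []              # stands in for on-disk tuples

    def insert(self, **values):
        self.tuples.append({c: values.get(c) for c in self.visible})

    def drop_column(self, name):
        # Catalog-only change: stored tuples are left untouched.
        self.visible.remove(name)

    def select_all(self):
        # Queries only ever see the visible columns.
        return [{c: t[c] for c in self.visible} for t in self.tuples]

    def physical_size(self):
        # Crude size proxy: characters needed to store every value.
        return sum(len(str(v)) for t in self.tuples for v in t.values())

    def rewrite(self):
        # A full rewrite keeps only the visible columns in each tuple.
        self.tuples = [{c: t[c] for c in self.visible} for t in self.tuples]


table = ToyTable(["id", "old_email", "email"])
for i in range(3):
    table.insert(id=i, old_email=f"user{i}@old.example", email=f"user{i}@new.example")

size_before = table.physical_size()
table.drop_column("old_email")
size_after_drop = table.physical_size()   # unchanged: nothing was rewritten
rows_after_drop = table.select_all()      # but the column is no longer visible

table.rewrite()
size_after_rewrite = table.physical_size()
```

In the sketch, the size only goes down after `rewrite()`. In real Postgres the equivalent is either rows being updated over time or an operation that rewrites the whole table.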