
Hi there, I'm one of the pgroll authors :)

I could be mistaken here, but I believe that pg-osc and pgroll take similar approaches to avoiding locks and to backfilling data.

While pg-osc uses a shadow table and switches to it at the end of the process, pgroll creates shadow columns within the existing table and leverages views to expose old and new versions of the schema at the same time. Having both versions available means you can deploy the new version of the client app in parallel to the old one, and perform an instant rollback if needed.
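To make the "shadow column plus views" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres, and every table, column, and view name in it is hypothetical, not pgroll's actual naming. It only shows that two views over one table can present an old and a new schema at the same time.

```python
import sqlite3

# Toy illustration only: SQLite stands in for Postgres, and all names
# (users, shadow_full_name, users_v1, users_v2) are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');

    -- Shadow column that holds the new version of the data.
    ALTER TABLE users ADD COLUMN shadow_full_name TEXT;
    UPDATE users SET shadow_full_name = name;

    -- One view per schema version, both backed by the same table.
    CREATE VIEW users_v1 AS SELECT id, name FROM users;
    CREATE VIEW users_v2 AS SELECT id, shadow_full_name AS full_name FROM users;
""")

# An old client reads through users_v1, a new client through users_v2.
old_rows = conn.execute("SELECT id, name FROM users_v1 ORDER BY id").fetchall()
new_rows = conn.execute("SELECT id, full_name FROM users_v2 ORDER BY id").fetchall()
new_columns = [c[0] for c in conn.execute("SELECT * FROM users_v2").description]
```

In this sketch, rolling back would amount to dropping users_v2 and the shadow column, since users_v1 and the original column are never touched.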




Thanks for the reply. A write-up on the pros and cons of these approaches would be fantastic. I have no clue which is better, but I believe pg-osc is heavily inspired by github/gh-ost, their tool for online schema change for MySQL.


Does pgroll have any process to address table bloat after the migration? One of the (many) nice things about pg-osc is that it results in a fresh new table without bloat.


Another pgroll author here :)

I'm not very familiar with pg-osc, but migrations with pgroll are a two-phase process: an 'in progress' phase, during which both old and new versions of the schema are accessible to client applications, and a 'complete' phase, after which only the latest version of the schema is available.

To support the 'in progress' phase, some migrations (such as adding a constraint) require creating a new column and backfilling data into it. Triggers are also created to keep both old and new columns in sync. So during this phase there is 'bloat' in the table in the sense that this extra column and the triggers are present.
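The trigger-and-backfill step described above can be sketched roughly as follows. This again uses sqlite3 as a stand-in for Postgres, with hypothetical names (items, price, new_price) and a made-up migration that changes a text column to an integer one. Only the old-to-new sync direction is shown, and the batch size is arbitrary.

```python
import sqlite3

# Toy illustration only: SQLite stands in for Postgres; all names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, price TEXT, new_price INTEGER);
    INSERT INTO items (id, price) VALUES (1, '10'), (2, '25');

    -- Keep the new column in sync when old-schema clients write `price`.
    CREATE TRIGGER items_sync_insert AFTER INSERT ON items
    WHEN NEW.new_price IS NULL
    BEGIN
        UPDATE items SET new_price = CAST(NEW.price AS INTEGER) WHERE id = NEW.id;
    END;

    CREATE TRIGGER items_sync_update AFTER UPDATE OF price ON items
    BEGIN
        UPDATE items SET new_price = CAST(NEW.price AS INTEGER) WHERE id = NEW.id;
    END;
""")


def backfill(conn: sqlite3.Connection, batch_size: int = 1) -> None:
    """Fill new_price for pre-existing rows in small batches."""
    while True:
        cur = conn.execute(
            "UPDATE items SET new_price = CAST(price AS INTEGER) "
            "WHERE id IN (SELECT id FROM items "
            "             WHERE new_price IS NULL AND price IS NOT NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:  # nothing left to backfill
            break


# A write through the old column is mirrored by the trigger...
conn.execute("INSERT INTO items (id, price) VALUES (3, '7')")
# ...while rows that existed before the triggers are filled by the backfill.
backfill(conn, batch_size=1)
rows = conn.execute("SELECT id, price, new_price FROM items ORDER BY id").fetchall()
```

Working in small batches keeps each write transaction short, which is the usual reason for backfilling this way rather than in one large UPDATE.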

Once the migration is completed, however, the old version of this column is dropped from the table along with any triggers, so there is no bloat left behind after the migration is done.


Thanks for the reply. My question was specifically about the MVCC behavior that creates new row versions for updates like this. If you're backfilling data into a new column, then you'll likely end up creating new row versions for the entire table, and the space for the old versions will only be marked for reuse by autovacuum. Anyway, bloat like this is a big pain for me when making migrations on huge tables. It doesn't sound like this type of bloat cleanup is a goal for pgroll. Regardless, it's always great to have more options in this space. Thanks for your work!


> ... so there is no bloat left behind after the migration is done.

This is only true once all rows have been rewritten after the old column is dropped. In standard, unmodified Postgres, DROP COLUMN does not rewrite existing tuples.
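The two bloat points raised above (a full-table backfill leaving dead row versions behind, and DROP COLUMN not rewriting tuples) can be illustrated with a deliberately crude model. This is not how Postgres is implemented: ToyHeap, its slots, and its methods are invented for this sketch, and pages, transactions, and visibility rules are ignored entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class RowVersion:
    data: Dict[str, object]
    dead: bool = False


class ToyHeap:
    """Crude stand-in for a table heap: updates write new row versions."""

    def __init__(self, rows: List[Dict[str, object]]) -> None:
        self.slots: List[Optional[RowVersion]] = [RowVersion(dict(r)) for r in rows]
        self.dropped_columns: Set[str] = set()

    def _store(self, version: RowVersion) -> None:
        # Reuse a freed slot if one exists, otherwise grow the heap.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = version
                return
        self.slots.append(version)

    def update_all(self, change: Callable[[Dict[str, object]], Dict[str, object]]) -> None:
        live = [v for v in self.slots if v is not None and not v.dead]
        for old in live:
            old.dead = True  # the old version stays in place until vacuumed
            self._store(RowVersion(change(dict(old.data))))

    def vacuum(self) -> None:
        # Dead versions become reusable space; the heap itself does not shrink.
        self.slots = [None if (v is not None and v.dead) else v for v in self.slots]

    def drop_column(self, name: str) -> None:
        # Metadata-only change: stored row versions are left untouched.
        self.dropped_columns.add(name)

    def live_versions_storing(self, column: str) -> int:
        return sum(
            1 for v in self.slots
            if v is not None and not v.dead and column in v.data
        )


heap = ToyHeap([{"id": i, "name": f"user{i}"} for i in range(4)])

# Backfilling a new column rewrites every row: 4 live + 4 dead versions.
heap.update_all(lambda row: {**row, "new_name": str(row["name"]).upper()})
slots_after_backfill = len(heap.slots)

heap.vacuum()
slots_after_vacuum = len(heap.slots)  # space is reusable, size is unchanged

heap.drop_column("name")
name_copies_after_drop = heap.live_versions_storing("name")  # data still stored

# Only a later rewrite of each row actually sheds the dropped column's data.
heap.update_all(lambda row: {k: v for k, v in row.items() if k not in heap.dropped_columns})
name_copies_after_rewrite = heap.live_versions_storing("name")
```

In this model the heap never gets smaller than its peak size, which is the kind of bloat that a full table rewrite (as in the shadow-table approach) avoids.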



