You could certainly get the same effect without downtime prior to this. It just needed a few extra steps and some cooperation from the application: adding a nullable column without a default has always been instantaneous, so you would do that, then update all the rows that are missing the value.

Finally, you add the default to the schema, which at that point is in the millisecond range because all/most rows already have a value.

And then you add the NOT NULL constraint, which will also be in the ms range.

You can do all of this in a transaction too, because Postgres' schema modifications are transactional.

This is how I have added countless columns with default values without downtime over the last few years.
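
Roughly, in SQL, that sequence looks like this (just a sketch with made-up table/column names; the backfill is shown as a single statement here, though in practice you'd batch it, as discussed in the replies):

    -- 1. Nullable column without a default: metadata-only, effectively instant
    ALTER TABLE orders ADD COLUMN status text;

    -- 2. Backfill existing rows
    UPDATE orders SET status = 'new' WHERE status IS NULL;

    -- 3. Add the default, which only affects future inserts
    ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'new';

    -- 4. Add the constraint (Postgres checks existing rows for NULLs here)
    ALTER TABLE orders ALTER COLUMN status SET NOT NULL;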




Updating a billion rows to give a value to a new column this way is incredibly slow, since a blanket UPDATE still runs as a single implicit transaction, and so does the constraint check before the default is applied. It can also use a LOT of disk space.


We always did that in batches of about 1k to 10k rows. It takes a while, but at least it doesn't hammer the DB.
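
Something like the following, driven from a script that repeats until zero rows are updated (a sketch; the batch size and the orders/status/id names are placeholders, with id being the primary key):

    -- One batch; rerun until it reports UPDATE 0
    UPDATE orders
    SET    status = 'new'
    WHERE  id IN (
        SELECT id
        FROM   orders
        WHERE  status IS NULL
        LIMIT  10000
    );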


If you are doing this, pay attention to the number of dead tuples accumulating across batches. We had to VACUUM ANALYZE after each batch on at least one occasion where the index stopped being used by the query planner past a certain threshold of dead tuples. Even though there was no table lock, the performance hit from the long-running queries made the application unusable.

The following query will show the dead tuple count for the table:

    select tuple_count, dead_tuple_count from pgstattuple('table_name');
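
Note that pgstattuple is a contrib extension, so it has to be enabled once per database before that query works; and when dead tuples do pile up, the manual VACUUM ANALYZE between batches mentioned above is the fix (sketch, hypothetical table name):

    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    -- between batches, once dead tuples accumulate
    VACUUM ANALYZE orders;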


The article does address this: it still has to check the DB for NULLs after you're done. Less downtime, but for huge tables that check does take a significant amount of time.


This optimization only applies to new columns with a stable default, so a table scan is not required to check for NULLs: every row gets the same default, which either is NULL or isn't.
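
Which is why, on Postgres 11+, the whole dance collapses to a single statement for non-volatile defaults; the default is kept in the catalog rather than written into every existing row (sketch with made-up names):

    -- Metadata-only on PG 11+: existing rows are not rewritten
    ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'new';

    -- A volatile default (e.g. random()) still forces a full table rewrite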


I was talking about the way to do it before: you could create a nullable column, fill in the value in batches, then add the default and make it non-nullable. But that last part took some time (not in the range of ms) for large tables.


> But that last part took some time (not in the range of ms) for large tables

Create a temporary index covering that column; then it will be in the range of ms.


You can create a temporary partial index on that table; then the check will be instantaneous. There is also `create index concurrently`, which won't take locks that block writes to the table.

With this, the re-check for NULLs when adding the NOT NULL constraint will take hardly any time at all.
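
A sketch of that approach (made-up names; the partial index mainly makes it cheap to confirm the backfill is complete, and how much the SET NOT NULL step itself benefits depends on your Postgres version):

    -- Built without blocking writes; cannot run inside a transaction block
    CREATE INDEX CONCURRENTLY orders_status_null_idx
        ON orders (id) WHERE status IS NULL;

    -- Fast pre-check that no NULLs remain (can use the partial index)
    SELECT count(*) FROM orders WHERE status IS NULL;

    ALTER TABLE orders ALTER COLUMN status SET NOT NULL;

    DROP INDEX CONCURRENTLY orders_status_null_idx;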


Sorry, I should have clarified: “accidental downtime”. This doesn't make anything new possible, but it will definitely eliminate a common way to shoot oneself in the foot.



