Hacker News new | past | comments | ask | show | jobs | submit login

> It does not take locks, other than for very briefly.

You are right. I think this is a misunderstanding in the original material: if one does not use a default expression (i.e. the default is NULL) then the implementation they are using -- Postgres -- will just twiddle the catalog information. My guess is that they did not know this detailed system artifact -- adding a column of NULL value to table of billions of records is for most purposes as fast as a table of nothing.

No need to do such a fancy switchover of tables, although one will still need to backfill a desired default, as you have indicated in step 3.




For people wanting to then use this solution, the detailed explanation is that you 1) make the column without a default; 2) modify your application logic as you would otherwise already have to fill in the column during insert and maintain it during update, but not to rely on or use the column for updating other data; 3) alter the column to add the default, to be used for new rows past that point (you do this after step #2, as otherwise you won't know which values should be defaulted in step #4 and which values need special attention); and finally 4) update, in reasonably-sized batches, all of the old rows that are still null to the "correct value" (which might be the default, but might be a calculation based on other data).

(edit, as I forgot the conclusion:) At this point you will have all of your data filled in correctly, so you alter the column to add a not null constraint (if you had wanted that, which you probably did as otherwise you likely wouldn't have needed a non-null default in the first place) and then go back to into your application and go about business as usual using your new column in ways that would affect other data.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: