Best way to add new columns

1k views Asked by At

What are best practices around adding new columns to tables? How do people usually do this since non-NULL defaults aren't allowed?

1

There are 1 answers

0
Mike Curtiss On

There are a few steps:

  1. Create your new column.
  2. Run a job that populates the new column however you wish. If you don't have too much data, you could have a single process that updates every row (or perhaps does it in batches of N for slightly better performance). Or you could use Dataflow for bulk import. Or you could have your application code do it, but it must be tolerant of NULL values.
  3. [optional] If you want to add a NOT NULL constraint, you can do it (assuming you've actually populated every row). If you have other processes that are adding rows in parallel, then those processes will need to be updated to also populate your new column, else you'll be playing a never-ending game of catch-up. Note that, when a NOT NULL constraint is added, cloud spanner initiates a background process to validate your data (i.e. checks that all cells are not null) before actually finalizing the schema change. More info on data validation is here.
  4. At this point, you can update your application code to start reading & depending on your new column.

More info is here.