I am loading a bunch of log files into BigQuery using Apache Beam on Dataflow. The file format can change over time as new columns are added to the files. I see there is a schema update option, ALLOW_FIELD_ADDITION.
Does anyone know how to use it? This is how my WriteToBQ step looks:
| 'write to bigquery' >> beam.io.WriteToBigQuery('project:datasetId.tableId', write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
I haven't actually tried this yet, but digging into the documentation, it seems you can pass whatever configuration you like to the BigQuery load job using additional_bq_parameters. In this case that would mean setting schemaUpdateOptions to a list containing ALLOW_FIELD_ADDITION in that dictionary. Weirdly, a dedicated setting for schema update options exists in the Java SDK but doesn't seem to have made its way to the Python SDK.
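A rough, untested sketch of what passing the option through additional_bq_parameters might look like. The helper name write_to_bq is made up for illustration and is not part of Beam; schemaUpdateOptions is the field name from the BigQuery load job configuration. As far as I can tell this only has an effect when the write goes through load jobs, and you would presumably still need to supply a schema that includes the new columns.

```python
# Extra load-job settings that Beam forwards to BigQuery.
# 'schemaUpdateOptions' is the load-job configuration field that
# accepts ALLOW_FIELD_ADDITION.
ADDITIONAL_BQ_PARAMETERS = {
    'schemaUpdateOptions': ['ALLOW_FIELD_ADDITION'],
}


def write_to_bq(table='project:datasetId.tableId'):
    """Build the write step (hypothetical helper, not part of Beam)."""
    # Imported lazily so the dict above can be inspected without Beam installed.
    import apache_beam as beam

    return beam.io.WriteToBigQuery(
        table,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        additional_bq_parameters=ADDITIONAL_BQ_PARAMETERS,
    )


# Usage inside a pipeline:
#   rows | 'write to bigquery' >> write_to_bq()
```

Keeping the dictionary separate from the step makes it easy to add other load-job settings later without touching the pipeline code.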