I'm using Datastream to replicate data from MySQL to BigQuery, but I want to exclude the Datastream UUID column that's added to my BigQuery tables. However, I still need to maintain CDC (Change Data Capture) functionality. How can I achieve this?
Here's a breakdown of my setup:
- I'm using Datastream to replicate data from MySQL to BigQuery.
- Datastream automatically adds a UUID column to the replicated tables in BigQuery.
According to this documentation, metadata fields are automatically included when an event is generated. Datastream lets you specify include and exclude lists for tables and schemas, so that only the data you want is streamed from the source to the destination. For included tables, you can also exclude specific columns to fine-tune exactly which data is streamed. As far as I can tell, these lists apply to the columns of your source tables, not to the metadata fields that Datastream adds.
Based on my understanding, you cannot turn off the creation of the metadata columns in BigQuery. As a workaround, consider creating a view that selects only the columns you want from the BigQuery table created by Datastream. If you would like this feature in Datastream, you can raise a feature request in Issue Tracker.
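To make the view workaround concrete, here is a minimal sketch in Python that only builds the view's DDL string. It assumes the metadata column in your table is named `datastream_metadata` (check your table's schema, since the name may differ), and the project, dataset, table, and view names are placeholders I made up for illustration. `build_view_ddl` is a hypothetical helper, not part of any Google library.

```python
def build_view_ddl(project, dataset, table, view,
                   excluded_columns=("datastream_metadata",)):
    """Return a CREATE OR REPLACE VIEW statement that hides the given columns.

    The default column name is an assumption; verify it against the schema
    of the table that Datastream created.
    """
    if not excluded_columns:
        raise ValueError("excluded_columns must contain at least one column")
    # Backtick-quote each column name for use inside SELECT * EXCEPT (...).
    except_list = ", ".join(f"`{col}`" for col in excluded_columns)
    return (
        f"CREATE OR REPLACE VIEW `{project}.{dataset}.{view}` AS\n"
        f"SELECT * EXCEPT ({except_list})\n"
        f"FROM `{project}.{dataset}.{table}`"
    )


# Placeholder names, for illustration only.
ddl = build_view_ddl("my-project", "my_dataset", "orders", "orders_clean")
print(ddl)
```

You can paste the printed statement into the BigQuery console or run it with whichever client you already use. The view only changes what you see when querying: the underlying table keeps its metadata column, so Datastream should keep applying changes to it as before.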