I have a BigQuery schema in which a RECORD field is JSON-like; however, the keys in the JSON are dynamic, i.e. new keys might emerge with new data, and it is hard to know how many keys there are in total. As I understand it, BigQuery cannot be used for such a table, since the schema of a RECORD field needs to be explicitly defined or it will throw an error.
The only alternative I see is to use the JSON_EXTRACT function while querying the data, which would parse the JSON stored as a text field. Is there any other way to have dynamic nested schemas in a BigQuery table?
You can create a fixed schema for the common fields and set those columns as NULLABLE. A single column of type STRING can then hold the rest of the JSON, and you can query it with the JSON functions.
We always have a "meta" column in our tables, which holds additional raw, unstructured data as a JSON object. Please note that currently you can store up to 2 MB in a string column, which is decent for a JSON document.
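The split between fixed columns and a "meta" string column is something your app can do before inserting rows. A minimal sketch of that idea, assuming hypothetical field names ("user_id", "event"):

```python
import json

# Known, common fields get real (nullable) columns; everything else goes
# into a single "meta" STRING column as serialized JSON.
# The field names here are hypothetical examples, not a required schema.
FIXED_FIELDS = ("user_id", "event")

def split_record(record):
    """Split a dict with dynamic keys into fixed columns plus a JSON 'meta' string."""
    # Missing common fields become None, i.e. NULL in a nullable column.
    row = {name: record.get(name) for name in FIXED_FIELDS}
    # All remaining (dynamic) keys are kept as raw JSON text.
    extra = {k: v for k, v in record.items() if k not in FIXED_FIELDS}
    row["meta"] = json.dumps(extra)
    return row

row = split_record({"user_id": 1, "event": "click", "new_key": "surprise"})
```

New keys never break the table schema; they simply land in `meta` and stay queryable via the JSON functions.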
To make the data easier to deal with, you can create views from your queries that use JSON_EXTRACT, and reference the views in other, simpler queries.
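One way to keep such views maintainable is to generate the view SQL from a list of the JSON keys you care about. A sketch, assuming hypothetical dataset, table, and key names (only the call to BigQuery's JSON_EXTRACT_SCALAR function is taken as given):

```python
# Generate SQL for a view that pulls selected keys out of the "meta" JSON
# string with JSON_EXTRACT_SCALAR, so downstream queries can select plain
# columns instead of repeating the JSON parsing everywhere.
def build_view_sql(view, table, json_keys):
    extracts = ",\n  ".join(
        "JSON_EXTRACT_SCALAR(meta, '$.{k}') AS {k}".format(k=k) for k in json_keys
    )
    return (
        "CREATE VIEW {view} AS\n"
        "SELECT\n  user_id,\n  {extracts}\n"
        "FROM {table}".format(view=view, table=table, extracts=extracts)
    )

sql = build_view_sql("mydataset.events_view", "mydataset.events", ["browser", "country"])
```

Queries can then read `browser` and `country` from the view as if they were ordinary columns.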
Also, at the streaming-insert phase, your app could denormalize the JSON into proper tables.
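Denormalizing at insert time means the app flattens the nested JSON into rows for separate, properly typed tables before calling the insert API. A sketch with hypothetical table and field names:

```python
import json

# Sketch of app-side denormalization: one nested order event becomes one row
# for an "orders" table plus one row per item for an "order_items" table.
# All names here are hypothetical, for illustration only.
def denormalize(event_json):
    event = json.loads(event_json)
    order_row = {"order_id": event["order_id"], "user": event["user"]}
    item_rows = [
        {"order_id": event["order_id"], "sku": item["sku"], "qty": item["qty"]}
        for item in event.get("items", [])
    ]
    return {"orders": [order_row], "order_items": item_rows}

rows = denormalize('{"order_id": 7, "user": "ann", "items": [{"sku": "a", "qty": 2}]}')
```

Each resulting list can then be streamed into its own table, so the nested structure is queryable with ordinary joins instead of JSON parsing.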