Following a link I found through Google, I'm trying a sample setup that publishes messages to Pub/Sub and loads them into a BigQuery table using Dataflow SQL.
But when I create the Dataflow job, I get the error below:
Invalid/unsupported arguments for SQL job launch: Invalid table specification in Data Catalog: Unsupported schema specified for Pubsub source in CREATE TABLE. CREATE TABLE for Pubsub topic must include at least 'event_timestamp' field of type 'TIMESTAMP'
Kindly help me fix this and clarify the doubts below:
- Is it mandatory to keep the event_timestamp field in the Pub/Sub schema, Dataflow SQL, and the BigQuery table?
- When I create a Pub/Sub topic with a schema, the schema isn't reflected in Dataflow SQL. When I assign it manually from Cloud Shell using gcloud data-catalog entries update, it is reflected: searching for the topic name in Dataflow SQL shows the schema. So which is the right method to assign a schema to a Pub/Sub topic?
- Data Catalog also doesn't show the schema assigned to the Pub/Sub topic.
Let me know if any more details are required.
I was able to follow the documentation and it yielded successful results. Response to your questions:
1. Is it mandatory to keep the event_timestamp field in the Pub/Sub schema, Dataflow SQL, and the BigQuery table?
Yes, it is necessary in this scenario, since the query in this setup uses the TUMBLE function and the `event_timestamp` column is its DESCRIPTOR. Note: For a Pub/Sub source, you must specify the `event_timestamp` field as the `timestamp_column`.
2. When I create a Pub/Sub topic with a schema, the schema isn't reflected in Dataflow SQL, whereas when I assign it manually from Cloud Shell using `gcloud data-catalog entries update`, it is reflected: searching for the topic name in Dataflow SQL shows the schema. So which is the right method to assign a schema to a Pub/Sub topic?
You can assign a schema using either the console or gcloud. However, both methods are subject to limitations.
You can use `gcloud data-catalog entries update` to update an existing schema.
3. Data Catalog also doesn't show the schema assigned to the Pub/Sub topic.
You may use `gcloud data-catalog entries lookup` to check which schema is attached to the topic's entry, and let me know what it returns.
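To tie the three answers together, here is a minimal sketch of assigning a schema that satisfies the event_timestamp requirement and then checking it in Data Catalog. The project ID, topic name, schema file name, and the `message` column are hypothetical placeholders, not values from your setup; the schema layout and the `--lookup-entry` / `--schema-from-file` flags follow the pattern used in the Dataflow SQL tutorial, so please check them against your gcloud version.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: replace with your own project and topic.
PROJECT_ID="my-project"
TOPIC="sample_topic"
SCHEMA_FILE="topic_schema.yaml"

# Schema for the topic's Data Catalog entry. Dataflow SQL expects an
# event_timestamp column of type TIMESTAMP for a Pub/Sub source.
# The "message" column is only an example payload field.
cat > "$SCHEMA_FILE" <<'EOF'
- column: event_timestamp
  description: Pub/Sub event timestamp
  mode: REQUIRED
  type: TIMESTAMP
- column: message
  description: Example payload field (hypothetical)
  mode: NULLABLE
  type: STRING
EOF

# SQL-style resource name of the topic entry; the project ID is
# backtick-quoted because it contains a hyphen.
ENTRY="pubsub.topic.\`${PROJECT_ID}\`.${TOPIC}"

if command -v gcloud >/dev/null 2>&1; then
  # Attach (or overwrite) the schema on the topic's Data Catalog entry.
  gcloud data-catalog entries update \
    --lookup-entry="$ENTRY" \
    --schema-from-file="$SCHEMA_FILE"

  # Print the entry so you can confirm the schema is now attached.
  gcloud data-catalog entries lookup "$ENTRY"
else
  echo "gcloud not found; would update and look up entry: $ENTRY"
fi
```

If the lookup output lists event_timestamp with type TIMESTAMP, the "Unsupported schema specified for Pubsub source" error should no longer apply to this topic.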