CDAP pipeline from Oracle to BigQuery Multi Table


I am building a pipeline in CDAP in which I connect to an Oracle database, read a table, and feed that data into the BigQuery Multi Table sink.

Both components were validated individually by the CDAP tool itself, but when I ran the complete pipeline I received this error:

ERROR Spark program 'phase1' failed with error: BQ_TEST has no outputs. Please check that the sink calls addOutput at some point.



There are 2 answers

Yaojie Feng

To use the BigQuery multi sink, you need to set runtime arguments that tell the sink which tables to write to. Each argument key has the form multisink.{dataset-name}.{table-name}, and its value is the JSON string representation of the table schema.
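As a concrete sketch (the dataset name `my_dataset` and table name `my_table` below are placeholders, not values from the question), the runtime arguments could look like this, with the schema JSON passed as an escaped string value:

```json
{
  "multisink.my_dataset.my_table": "{\"name\": \"my_table\", \"type\": \"record\", \"fields\": [{\"name\": \"id\", \"type\": \"long\"}, {\"name\": \"name\", \"type\": \"string\"}]}"
}
```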

xgMz

It sounds like the source might not have any records.

Adding to @Yaojie Feng's answer: the sink needs the schema in Avro format, and the Multiple Database Tables source plugin would produce the schema required by the BigQuery Multi Table plugin; an example is shown below.

Sample pipeline runtime arguments with schema in Avro format:

Key: multisink.NEW_TABLE_NAME

Value:

{
  "name": "NEW_TABLE_NAME",
  "type": "record",
  "fields": [
    { "name": "id", "type": "long" },
    { "name": "name", "type": "string" }
  ]
}
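If you are setting these arguments programmatically (for example, when starting the pipeline through the CDAP REST API), it can help to build the schema string with a JSON library rather than hand-escaping it. A minimal sketch, where `multisink_schema` is a hypothetical helper name, not part of any CDAP API:

```python
import json

def multisink_schema(table_name, fields):
    """Build the runtime-argument value for one table of the
    BigQuery Multi Table sink: an Avro-style record schema,
    serialized as a JSON string."""
    return json.dumps({
        "name": table_name,
        "type": "record",
        "fields": [{"name": name, "type": avro_type} for name, avro_type in fields],
    })

# Key/value pair matching the example above.
arg_key = "multisink.NEW_TABLE_NAME"
arg_value = multisink_schema("NEW_TABLE_NAME", [("id", "long"), ("name", "string")])
print(arg_key, "=", arg_value)
```

This keeps the escaping correct and makes it easy to generate one argument per destination table.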

Source.