Arrays not supported in BigQuery Python API


The documentation for the BigQuery Python API suggests that arrays are supported. However, loading a pandas DataFrame that contains them into BigQuery fails with a pyarrow struct error.

The only workaround seems to be to drop the nested columns and then use JSON normalize (pandas `json_normalize`) to load them into a separate table.
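
The workaround described above could be sketched roughly as follows. This is only an illustration: the helper `split_nested_column`, the sample `orders` DataFrame, and its column names are hypothetical, and it assumes pandas 1.0 or later (where `pd.json_normalize` is available) and that each nested cell holds a list of dicts.

```python
import pandas as pd


def split_nested_column(df, nested_col, key_col):
    """Split a DataFrame into a flat parent table and a child table.

    Hypothetical helper: `nested_col` is assumed to hold a list of dicts
    per row, and `key_col` identifies the parent row.
    """
    # Parent table: everything except the nested column.
    flat_df = df.drop(columns=[nested_col])

    # Child table: one row per nested element, tagged with the parent key.
    records = df[[key_col, nested_col]].to_dict(orient="records")
    child_df = pd.json_normalize(records, record_path=nested_col, meta=[key_col])
    return flat_df, child_df


# Made-up sample data for illustration only.
orders = pd.DataFrame(
    {
        "order_id": [1, 2],
        "items": [
            [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}],
            [{"sku": "C", "qty": 5}],
        ],
    }
)

flat_orders, order_items = split_nested_column(orders, "items", "order_id")
```

Both resulting DataFrames contain only scalar columns, so each could then be passed to `load_table_from_dataframe` as its own table and joined back on the key column in BigQuery.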

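'''
from google.cloud import bigquery

# `credentials` and `appended_data` (the pandas DataFrame) are defined elsewhere.
project = 'lake'
client = bigquery.Client(credentials=credentials, project=project)
dataset_ref = client.dataset('XXX')
table_ref = dataset_ref.table('RAW_XXX')

# Let BigQuery infer the schema and overwrite the table on each load.
job_config = bigquery.LoadJobConfig()
job_config.autodetect = True
job_config.write_disposition = 'WRITE_TRUNCATE'

# This is the call that fails when the DataFrame contains nested columns.
client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()
'''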

This is the error received: NotImplementedError: struct


1 Answer

Héctor Neri

This is currently not supported because of how the client library serializes the DataFrame to Parquet before loading it.

A feature request to support uploading pandas DataFrames that contain arrays has been opened on the client library's GitHub repository:

https://github.com/googleapis/google-cloud-python/issues/8544
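
Until that request is resolved, one possible way to sidestep the Parquet step is to convert the DataFrame to JSON rows and load those instead. The sketch below is not part of the original answer and has not been run against a live project. It assumes a version of `google-cloud-bigquery` that provides `Client.load_table_from_json`; the helpers `dataframe_to_json_rows` and `load_rows_as_json` and the `sample` DataFrame are hypothetical names used only for illustration.

```python
import json

import pandas as pd


def dataframe_to_json_rows(df):
    """Convert a DataFrame to a list of JSON-compatible dicts, one per row."""
    # Round-tripping through to_json turns NumPy scalars into plain Python
    # values and keeps list-valued cells as lists.
    return json.loads(df.to_json(orient="records"))


def load_rows_as_json(client, rows, table_ref):
    """Load JSON rows into BigQuery (assumption: not tested on a live project)."""
    # Imported here so the conversion helper works without the client library.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig()
    job_config.autodetect = True
    job_config.write_disposition = "WRITE_TRUNCATE"
    return client.load_table_from_json(rows, table_ref, job_config=job_config).result()


# Made-up sample data with an array column.
sample = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})
rows = dataframe_to_json_rows(sample)
```

With the `client` and `table_ref` from the question, the load itself would be `load_rows_as_json(client, rows, table_ref)`. Whether schema autodetection infers the array column as intended should be checked on a small table first.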