CSV Export from Bigquery having columns surrounded with double quotes adding not working

477 views Asked by At

I am trying to export a table as CSV file in GCS bucket from Google Big Query console with export command, but facing issues with double quotes.

I want to export as csv but every column value should be surrounded with double quotes, i tried using various String functions such as FORMAT, CONCAT, REPLACE etc, but not able to get desired results.

I am trying to export as csv because only these formats are supported by Export command (CSV, JSON, Parquet, Avro) , even a plain text file is fine for me, but don't have that option and i need columns surrounded by double quotes and delimited by comma.

this is export command:

EXPORT DATA OPTIONS( uri='gs://bucket-name123/SOME-FILE-*', format='CSV', overwrite=true, header=false, field_delimiter=',')
AS
SELECT CONCAT('\"',column1,'\"'), CONCAT('\"',column2,'\"') from `projectId.datasetId.TABLE_NAME` where column3 = true;

Data is getting exported successfully in GCS bucket , but i am getting 3 double quotes surrounding this column instead of 1 double quotes:

content of GCS exported GCS file is as below:

"""value1""","""AAA""""
"""value2""","""BBB""""
"""value3""","""CCC""""

I am expecting this content format:

"value1","AAA"
"value2","BBB"

Any help or hint will be appreciated,

I have also tried FORMAT('%s',column1) -- this function, but it also produces 3 double quotes

I also tried to hit EXPORT query as a job from Java Google BQ api, but same result.

I can't use "bq extract" because i want to only export limited columns, and extract don't have functionality to export limited columns, it exports whole table.

1

There are 1 answers

2
dlebech On

Don't add double quotes yourself. When you export to CSV from BigQuery, double quotes are added automatically, if they are needed, e.g. for multi-line strings and escaping double quotes (which is why you are seeing repeated double quotes in your output)

So you command should simply be:

EXPORT DATA OPTIONS(
  uri='gs://bucket-name123/SOME-FILE-*', 
  format='CSV',
  overwrite=true,
  header=false,
  field_delimiter=',')
AS SELECT column1, column2 from projectId.datasetId.TABLE_NAME
WHERE column3 = true;

Most CSV readers will be able to read the output format.