What tool to use for text file that is tab-delimited where string fields are surrounded by quotes?

421 views Asked by At

As the title suggests, I've run into numerous dead ends when trying to find the correct tool to query my text file. Initially, I tried to import the data into Big Query but ran into an assortment of errors. I then transformed the data into a .csv file via GCP's Dataprep service, then imported that version into Big Query which led to another set of errors. I've tried removing the double quotes from the data using tr command in terminal and rerunning the dataprep/bigquery jobs to no avail. I've run out of options and I'm losing my mind. Please help.

Example of the data listed below:

"002086182-02" "" "ZUPLANTI" "GARY" "" "" "F" 02/15/1965 04/11/1989 "A" 04/01/1996 "D" "291" "" "MCMASTER ST" "" "PENN HILLS" "PA" "15222" "" "" "" "" "" 11/03/2020 "2260001" "2260001-1" 05/08/2021 "117684001" "2260001" "" "SC37" "MN226" "MD305" "STH035" "STS45""USC18" "CPC00" "CAC09"

1

There are 1 answers

3
ewertonvsilva On

The text you have posted is separated by spaces, not by tabs.

If your text is separated by spaces, use the following command:

cat file.csv | tr -s ' ' ',' |tr -d '"' > new_file.csv

Or, if your file is separated by tabs, use the following:

cat file.csv | tr -s '\t' ',' |tr -d '"' > new_file.csv

enter image description here

The result:

enter image description here