I am trying to import a CSV file into AWS Athena, and some of the fields contain embedded newlines.
The import uses a Hive SerDe.
The data looks like this (each field is enclosed in double quotes, and fields are separated by a pipe):
with new line"|"Data3"
How should I write the regular expression for the table DDL below?
CREATE EXTERNAL TABLE ssdm_schema.ABCTable_regex (
Data_A VARCHAR(100)
, Data_B VARCHAR(100)
, Data_C VARCHAR(100)
) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = '?????????'
)
I'm asking this question with reference to the following answer:
How to handle embed line breaks in AWS Athena
Thank you
Solved it. https://regex101.com/r/bYF1Zm/3
with the global (g) and unicode (u) flags set.
There were three things making this tricky:
This regex can probably be more succinct because the matching pattern repeats.
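
To illustrate the point about repetition, here is a minimal Python sketch. It is not the regex from the regex101 link above; it is an assumed pattern for three pipe-delimited, double-quoted fields, built by repeating one per-field sub-pattern. The sample values "Data1", "Data2", "A", "B" and "C" are hypothetical, and the sketch assumes fields never contain a double quote. Hive's RegexSerDe uses Java regular expressions, so treat this only as a way to check the matching logic, not as a tested Athena setting.

```python
import re

# One quoted field: a double quote, any run of non-quote characters, a double quote.
# A negated character class such as [^"] also matches newline characters,
# so an embedded line break inside the quotes stays part of the field.
FIELD = r'"([^"]*)"'

# The row pattern is the same field pattern repeated three times, joined by a literal pipe.
ROW = re.compile(r'\|'.join([FIELD] * 3))

# Hypothetical sample: the second field of the first record spans two lines.
sample = '"Data1"|"Data2\nwith new line"|"Data3"\n"A"|"B"|"C"\n'

# finditer plays the role of the global flag: it returns every record in the input.
rows = [m.groups() for m in ROW.finditer(sample)]

for row in rows:
    print(row)
```

Building the pattern from FIELD keeps the per-field rule in one place, so changing the delimiter or the number of columns is a one-line edit.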