I'm querying redshift spectrum and certain fields are showing up null without any explanation. I've checked
- SVL_S3LOG
- SVL_SPECTRUM_SCAN_ERROR
- SYS_EXTERNAL_QUERY_ERROR
And they are all empty.
In the below example, how do I force redshift to error rather than just return null?
I've tried playing around with table properties such as "data_cleansing_enabled" and related properties. I've also tried
set spectrum_query_maxerror=0
but the result is always the same.
CREATE EXTERNAL TABLE myschema.test (
numberfield decimal(18,3),
"timestamp" text
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
location 's3://mybucket/data/'
select * from myschema.test
The contents of the file at s3://mybucket/data/test.json:
-- note numberfield has more than 3 decimal digits
{"numberfield":"98.7400000", "timestamp": "2023-01-01T20:21:40.100226269+00:00"}
In the above file, the "numberfield" value has more tha 3 decimal digits and is therefore returning Null. I'm not sure why this is happening - I would have expected it to round to the appropriate place as stated by the redshift docs