Redshift Spectrum returns null WITHOUT error

69 views Asked by At

I'm querying redshift spectrum and certain fields are showing up null without any explanation. I've checked

  • SVL_S3LOG
  • SVL_SPECTRUM_SCAN_ERROR
  • SYS_EXTERNAL_QUERY_ERROR

And they are all empty.

In the below example, how do I force redshift to error rather than just return null?

I've tried playing around with table properties such as "data_cleansing_enabled" and related properties. I've also tried

set spectrum_query_maxerror=0

but the result is always the same.

CREATE EXTERNAL TABLE myschema.test (
    numberfield decimal(18,3),
    "timestamp" text
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
location 's3://mybucket/data/'

select * from myschema.test

The contents of the file at s3://mybucket/data/test.json:

-- note numberfield has more than 3 decimal digits 
{"numberfield":"98.7400000", "timestamp": "2023-01-01T20:21:40.100226269+00:00"}

In the above file, the "numberfield" value has more tha 3 decimal digits and is therefore returning Null. I'm not sure why this is happening - I would have expected it to round to the appropriate place as stated by the redshift docs

0

There are 0 answers