42703 ERROR: column "my_nested_column" does not exist

337 views Asked by At

I run a Glue Crawler over a nested JSON data-source on S3 and I tried to query nested fields as per documentation via Redshift Spectrum:

select c.id , c.my_nested_column.MyField
from my_external_schema.my_table c;

But as per title I was getting the error message

[42703] ERROR: column "my_nested_column" does not exist

which doesn't really make sense as from metadata I can see the field exists. But because of this I'm unable to unnest fields from "my_nested_column".

How to fix this?

1

There are 1 answers

0
Vzzarr On

After some investigations, I noticed that one of the fields of the JSONs being parsed contained a colon within the field name and it was something like my:field.

This clashes with the JSON logic and doesn't work really well. I removed this field from the Glue Catalog and afterwards I was able to query the field correctly.

The initial error message I was getting really didn't help but the problem was caused by malformed JSON field name.