I can't unnest sub arrays on athena

1.2k views Asked by At

I am using athena with JSON files and nested fields. How can I unnest sub levels?

This platform part of the schema:

`platforms` array<struct<key_name:boolean,key_name:string,key_name:boolean,key_name:string,
tags:array<string>,

A data sample if I directly query the platform field without unnest it:

[{registrationcomplete=true, source=mail, invitesent=true, label=test, tags=[top], ...

My issue is I can unnest the first level with athena:

UNNEST(platforms) AS t1 (platform);

I am running a query on multiple tables and when I call my sub array like this it doesn't work:

array_join('a.platform.tags', ',') AS plat_tags,

I get this error because the unnest changed the type:

SYNTAX_ERROR: line 6:1: Unexpected parameters (varchar(15), varchar(1)) for function array_join. Expected: array_join(array(T), varchar) T, array_join(array(T), varchar, varchar) T

If I perform a simple query on only one table this code is working:

array_join('platform.tags', ',') AS plat_tags,

I would like to know if a workaround exists or better directly transform my sub array tags like the other unnest!

because if I use this piece of code this works but the result is in an array which is not handled by aws quicksight.

platform.tags AS plat_tags,
1

There are 1 answers

0
Theo On BEST ANSWER

You are quoting platform.tags with single quotes, which is SQL syntax for strings.

To quote an identifier you must use double quotes, and put the quotes around each components (i.e. not the dot separating them). In this case I think quoting is unnecessary so write it like this:

array_join(platform.tags, ',') AS plat_tags

But if you want to quote the identifiers, write it like this:

array_join("platform"."tags", ',') AS plat_tags