Is there a way to include element index when using FLATTEN in a Dremio query?

670 views Asked by At

I have a number of rows in a parquet file exposed via Dremio, e.g.:

id data
1 [1.5,8.5,23.004]
2 [0.3,4.44, 2.59]

What I'm trying to achieve is a query that will flatten the array but give me an explicit index so I know which row pertains to which element in the array. The reason I need this is because I have another equivalent table where the 'data' values represent adjustment values that need to be queried and applied to the original data.

What I'm looking for are results like this:

id data_element idx
1 1.5 0
1 8.5 1
1 23.004 2
2 0.3 0
2 4.44 1
2 2.59 2

It looks like this needs some kind of windows function like ROW_NUMBER() OVER(...) but although window functions are available in Dremio, I can't get it to work.

Does anyone know if: a) it is possible to reference the index of the array element when flattening? b) the order that is returned when using flatten is always guaranteed to be the same as the order in the original array?

If b) is true, then I don't really need to worry about a).

1

There are 1 answers

0
MatBailie On

A).

You may need to include the index in the data, perhaps by nesting arrays. For example, if the raw data was [ [0, 1.5] , [1, 8.5] , [2, 23.004] ] then flatten would return three rows with an array as each datum, the first element of that array being the index and the second element being the data value you actually want.


B).

SQL Datasets do not have an implicit order, rather they're explicitly unordered. The properties of ordering only exist After the creation of columns from which the ordering can be derived.