How to avoid duplicates while doing unnest of multiple columns based on comma delimiter in bigquery?

47 views Asked by At

I am using below query to split the comma separated rows into 2 rows. But facing duplicate issues.

SELECT colm_A,colm_B, colm_C From Db.Table_A, UNNEST(Split(COLM_B,',')), COLM_B, UNNEST(SPLIT(COLM_C,',')) COLM_C;

Actual table

Colm_A Colm_B Colm_C
1 Fish F
2 Dog,cat D,C

Output from above query

Colm_A Colm_B Colm_C
1 Fish F
2 Dog D
2 Cat C
2 Dog D
2 Cat C

Expected output

Colm_A Colm_B Colm_C
1 Fish F
2 Dog D
2 Cat C
1

There are 1 answers

2
Mikhail Berlyant On

Use below approach

select Colm_A, Colm_B, Colm_C
from your_table, 
unnest(split(Colm_B)) Colm_B with offset
join unnest(split(Colm_C)) Colm_C with offset
using (offset)

if applied to sample data in your question - output is

enter image description here