|-- x: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- y: long (nullable = true)
| | |-- z: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- log: string (nullable = true)
I have the nested schema above and want to change the log field inside column z from string to struct, so that the schema becomes:
|-- x: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- y: long (nullable = true)
| | |-- z: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- log: struct (nullable = true)
| | | | | |-- b: string (nullable = true)
| | | | | |-- c: string (nullable = true)
I'm on Spark 2.4.x, not Spark 3. I'd prefer a Scala solution, but Python works too, since this is a one-time manual job to backfill some past data.
Is there a way to do this with a UDF, or some other way?
I know this is easy to do with from_json, but the nested array of structs is causing issues.
Higher-order functions are your friend in this case. Basically, coalesce. Code below.
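The answer's own code is not included here, so the following is only a minimal sketch of the higher-order-function approach, not the original snippet. It assumes that log holds JSON text with string fields b and c (for example '{"b": "1", "c": "2"}'), and the helper names nested_log_expr and convert are made up for illustration. It nests two Spark SQL transform calls (available from 2.4) and rebuilds each struct with named_struct, parsing log with from_json.

```python
# Sketch for Spark 2.4.x. Assumption: `log` contains JSON text whose
# fields match the target struct (b: string, c: string).
LOG_SCHEMA = "b string, c string"  # DDL string for the desired `log` struct


def nested_log_expr(log_schema=LOG_SCHEMA):
    """Build a Spark SQL expression that rebuilds x with z.log parsed into a struct."""
    return (
        # Outer transform walks the elements of x and rebuilds each struct.
        "transform(x, e -> named_struct("
        "'y', e.y, "
        # Inner transform walks the elements of z and parses log with from_json.
        "'z', transform(e.z, i -> named_struct("
        f"'log', from_json(i.log, '{log_schema}')))))"
    )


def convert(df):
    """Return df with column x replaced by the rebuilt version (hypothetical helper)."""
    # Imported here so that building the expression string does not need Spark.
    from pyspark.sql.functions import expr

    return df.withColumn("x", expr(nested_log_expr()))
```

Calling convert(df).printSchema() should then show log as a struct with b and c, although the nullable and containsNull flags may differ from the original schema because named_struct rebuilds every element. The same expression string can be passed to expr from the Scala API, so no UDF is needed in either language.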