I loaded a JSON document in Spark. Roughly, its schema looks like:
root
|-- datasetid: string (nullable = true)
|-- fields: struct (nullable = true)
...
| |-- type_description: string (nullable = true)
I then add a column to my DataFrame from the nested field:
df = df.withColumn("desc", df.col("fields.type_description"));
All fine, but the value of type_description looks like: "1 - My description type".
Ideally, I'd like my df to contain only the textual part, e.g. "My description type". I know how to do that on a plain string, but how can I do it through Spark?
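For reference, this is roughly the transformation I have in mind outside Spark, as a minimal plain-Java sketch. The class name DescriptionCleaner and the method stripPrefix are just illustrative names, and I'm assuming the prefix is always one or more digits followed by a dash:

```java
import java.util.regex.Pattern;

public class DescriptionCleaner {
    // Assumption: the unwanted prefix is digits, optional spaces, a dash, optional spaces.
    private static final Pattern PREFIX = Pattern.compile("^\\d+\\s*-\\s*");

    // Returns the value without its leading "<number> - " part.
    public static String stripPrefix(String value) {
        if (value == null) {
            return null; // the column is nullable, so pass nulls through
        }
        return PREFIX.matcher(value).replaceFirst("");
    }

    public static void main(String[] args) {
        // prints "My description type"
        System.out.println(stripPrefix("1 - My description type"));
    }
}
```

Values that do not start with a numeric prefix are returned unchanged.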
I was hoping for something along the lines of:
df = df.withColumn("desc", df.col("fields.type_description").call(/* some kind of transformation class / method*/));
Thanks!