I have a JSON file where a struct column is empty for one record and has values for another record. I want to read it in Scala and create a DataFrame. The code I am using:
var df: DataFrame = spark.read.option("multiline", "true").option("mode", "PERMISSIVE").json(path)
var updatedDf: DataFrame = {
  if (null != parentNode && parentNode.trim.nonEmpty)
    df.select(explode(col(parentNode)).as(parentNode)).select(s"$parentNode.*")
  else df
}
println(parentNode)
It works when I have consistent records, but it does not work with the JSON file below.
{
"result": [
{
"parent": "",
"roles": "",
"sys_created_by": "s466892"
},
{
"parent": {
"display_value": "AAT 01 Revenue Management and SkyCargo",
"link": "https://emiratesgroup.service-now.com/api/now/"
},
"roles": "",
"sys_created_by": "S331704"
}
]
}
I am expecting to load the data into a DataFrame.
You can use the from_json and to_json functions, which are part of Spark's built-in function list, on top of a schema that is well-defined before you apply them. Parsing the inconsistent column against an explicit schema gives the resulting finalDF a uniform parent struct for both records.
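A minimal sketch of that approach, reusing the spark session and path from the question. The names parentSchema and finalDF are illustrative assumptions, and the sketch relies on Spark's JSON schema inference falling back to a plain string column for parent when one record holds "" and another holds an object:

```scala
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Define the struct type for "parent" explicitly instead of relying on
// inference, which sees "" in one record and an object in the other and
// therefore types the column as a plain string.
val parentSchema = StructType(Seq(
  StructField("display_value", StringType, nullable = true),
  StructField("link", StringType, nullable = true)
))

val df = spark.read
  .option("multiline", "true")
  .option("mode", "PERMISSIVE")
  .json(path)

// Flatten the top-level "result" array into one row per record.
val exploded = df.select(explode(col("result")).as("result")).select("result.*")

// Re-parse the string-typed "parent" column against the explicit schema:
// the record with "" becomes a null struct instead of breaking the read.
// (If "parent" were inferred as a struct, wrap it in to_json first and
// then apply from_json in the same way.)
val finalDF = exploded
  .withColumn("parent", from_json(col("parent"), parentSchema))

finalDF.printSchema()
finalDF.show(truncate = false)
```

With this, both records share the schema parent: struct&lt;display_value: string, link: string&gt;, and downstream code can select parent.display_value without type conflicts.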