I get the data in a predefined format I cannot change:
data =
{
"id": 12345,
"name": "Doe",
"gender": {
"textEn": "Masculin"
},
"professions": [
{
"job_description": {
"textEn": "Job description"
},
"cetTitles": [
{
"cetTitleType": {
"textEn": "Recognition"
},
"issuanceDate": "1992-04-14T00:00:00Z",
"phoneNumbers": [
"123 221 00 70"
]
}
]
}
]
}
While I can normalise data one level down ('gender') through the pd.json_normalize function, I struggle to access info further down in the hierarchy.
I tried e.g. getting the job description through
data = pd.json_normalize(data,record_path=['professions','job_description'],meta='id')
but I get a TypeError. How can I address this and extract all data into a table so the data above is represented as one row with all content as fields?
The output I would like to ultimately achieve looks as follows:
id | name | gender | job_description | cetTitleType | issuanceDate | phoneNumbers |
---|---|---|---|---|---|---|
12345 | Doe | Masculin | Job description | Recognition | 1992-04-14T00:00:00Z | 123 221 00 70 |
Using json_normalize():
Code:
Output: