json_normalize in Python

7.6k views Asked by At

I am trying here to use json_normalize to somehow format the output of an API, but I keep getting a faulty and empty csv file. I tried to change df2 = pd.json_normalize(response, record_path=['LIST']) , but keep getting this error message:

TypeError: byte indices must be integers or slices, not str

Could you please guide me on what am I doing wrong ?

Thanks a lot !

import requests
import json
import pandas as pd

url = "https://*hidden*Results/"

payload = json.dumps({
  "id": 12345
})
headers = {
  'Authorization': 'Basic *hidden*',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)


df1 = pd.DataFrame(response).iloc[:,:-2]
df2 = pd.json_normalize(response, record_path=None)

df = pd.concat([df1, df2], axis=1)
df.to_csv("test.csv", index=False)
1

There are 1 answers

7
anibal jasin On BEST ANSWER

You are passing the variable response in the call:

df2 = pd.json_normalize(response, record_path=None)

Which is an a requests.models.Response Object and you need to pass a dict, so you need to do something like pd.json_normalize(response.json(), record_path=['LIST'])

I tried it with this example and works:

>>> import pandas as pd
>>> data = [
...     {
...         "state": "Florida",
...         "shortname": "FL",
...         "info": {"governor": "Rick Scott"},
...         "counties": [
...             {"name": "Dade", "population": 12345},
...             {"name": "Broward", "population": 40000},
...             {"name": "Palm Beach", "population": 60000},
...         ],
...     },
...     {
...         "state": "Ohio",
...         "shortname": "OH",
...         "info": {"governor": "John Kasich"},
...         "counties": [
...             {"name": "Summit", "population": 1234},
...             {"name": "Cuyahoga", "population": 1337},
...         ],
...     },
... ]
>>> result = pd.json_normalize(data, ["counties"])
>>> result
         name  population
0        Dade       12345
1     Broward       40000
2  Palm Beach       60000
3      Summit        1234
4    Cuyahoga        1337

EDIT I will try to do this:

import requests
import json
import pandas as pd

url = "https://*hidden*Results/"

payload = json.dumps({
  "id": 12345
})
headers = {
  'Authorization': 'Basic *hidden*',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

json_response = response.json()

df1 = pd.DataFrame(json_response).iloc[:,:-2]
df2 = pd.json_normalize(json_response, record_path=['LIST'])

df = pd.concat([df1, df2], axis=1)
df.to_csv("test.csv", index=False)