I have a big JSON data file and I want to convert it into tabular form. I am trying to flatten the data into a dataframe using json_normalize. So far I have this:
I want to further flatten the submissions and products data into columns. I tried this:
submission_data = pd.json_normalize(data=rawData['results'], record_path=rawData['results']['submissions'], meta=['application_number', 'sponsor_name'], errors='ignore')
submission_data.head(3)
But I am getting an error saying: TypeError: list indices must be integers or slices, not str
Any input on this would be helpful.
As submissions and products are lists (and not objects with a regular structure), json_normalize will leave them untouched. Also, given that they are lists, can you confirm that every record has the same number of them? If not, distributing them through columns makes no sense. If submissions and products come in pairs (i.e. if every submission corresponds to one product), you can consider distributing them along rows instead (a melted-dataframe strategy).
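To illustrate the point about lists, here is a minimal sketch on made-up records. The field names submission_type and brand_name, and all of the values, are assumptions for illustration only; just the keys application_number, sponsor_name, submissions and products come from the question.

```python
import pandas as pd

# Hypothetical sample records; the inner field names and values are invented.
results = [
    {
        "application_number": "A1",
        "sponsor_name": "Acme",
        "submissions": [{"submission_type": "ORIG"}, {"submission_type": "SUPPL"}],
        "products": [{"brand_name": "X"}],
    },
    {
        "application_number": "A2",
        "sponsor_name": "Beta",
        "submissions": [{"submission_type": "ORIG"}],
        "products": [{"brand_name": "Y"}, {"brand_name": "Z"}],
    },
]

# Plain json_normalize keeps list-valued fields as single columns of Python lists.
flat = pd.json_normalize(results)
print(type(flat.loc[0, "submissions"]))

# Distributing along rows: explode gives one row per list element.
long = flat[["application_number", "submissions"]].explode("submissions")
print(len(long))
```

Note that the two records here have different numbers of submissions and products, which is exactly the case where spreading them across columns would not line up.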
Finally, regarding the error: rawData seems to be a list of objects that each contain a 'results' field. That means you cannot access rawData['results'] directly; use rawData[0]['results'] to get the results from the first object.
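A small sketch of how that error arises, assuming rawData really is a list of objects (the exact shape of your file is a guess on my part):

```python
# Hypothetical shape: a list of objects, each carrying a 'results' list.
rawData = [{"results": [{"application_number": "A1", "submissions": []}]}]

try:
    rawData["results"]  # indexing a list with a string key
except TypeError as exc:
    message = str(exc)
    print(message)

# Indexing by position first works.
first_results = rawData[0]["results"]
print(len(first_results))
```

Even if rawData turns out to be a dict, rawData['results']['submissions'] would fail in the same way whenever rawData['results'] is a list. In either case, record_path expects the name of the key (or a list of key names), not the data itself.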
Adding a proposed solution
Given your data structure, what I would do is the following:
Repeat the process for the 'products'. However, unless you know the relationship between submissions and products, there is no clear way of merging the dataframes you get.
In code:
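A minimal sketch of this approach, not necessarily matching your real file: it assumes rawData is a list of objects with a 'results' list, and the sample values and inner field names (submission_type, brand_name) are invented for illustration.

```python
import pandas as pd

# Hypothetical data in the assumed shape.
rawData = [
    {
        "results": [
            {
                "application_number": "A1",
                "sponsor_name": "Acme",
                "submissions": [{"submission_type": "ORIG"}, {"submission_type": "SUPPL"}],
                "products": [{"brand_name": "X"}],
            },
            {
                "application_number": "A2",
                "sponsor_name": "Beta",
                "submissions": [{"submission_type": "ORIG"}],
                "products": [{"brand_name": "Y"}, {"brand_name": "Z"}],
            },
        ]
    }
]

# Gather the records from every top-level object into one flat list.
records = [rec for obj in rawData for rec in obj["results"]]

# One row per submission; record_path is the key name, not the data.
submission_data = pd.json_normalize(
    records,
    record_path="submissions",
    meta=["application_number", "sponsor_name"],
    errors="ignore",
)

# Same process for the products.
product_data = pd.json_normalize(
    records,
    record_path="products",
    meta=["application_number", "sponsor_name"],
    errors="ignore",
)

print(submission_data.head(3))
print(product_data.head(3))
```

Both dataframes keep application_number and sponsor_name, so you could join them on those keys, but that pairs every submission with every product of the same application, which may or may not be what you want.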