Convert patentsview API data returned as nested JSON into a pandas dataframe

156 views Asked by At

I am trying to convert a JSON derived from the patentsview API into a pandas dataframe. However it is difficult because it seems to be a nested JSON!

Here is my code:

import requests
import pandas as pd
from pandas.io.json import json_normalize
import json


url = 'http://www.patentsview.org/api/patents/query?q={"cpc_group_id":"B60W"}&f=["inventor_first_name","inventor_last_name","patent_number", "assignee_country"]'
r = requests.get(url)
json_data = r.json()

df = pd.DataFrame(json_data['patents'])
df

See the image for the dataframe that is returned.

1

My question is how can I get the nested dictionary keys and their values into unique columns?

1

There are 1 answers

0
Peter Leimbigler On
result = pd.DataFrame.from_records(df.assignees.apply(lambda x: x[0]))
inventors = pd.DataFrame.from_records(df.inventors.apply(lambda x: x[0]))
result = pd.concat([result, inventors, df.patent_number], axis=1)

result.head()
  assignee_country assignee_key_id inventor_first_name inventor_key_id  \
0             None            None        Robert Cecil            4848   
1               GB           82078        Anthony John           16057   
2             None            None            James W.           16376   
3               FR          281289              Gilles           18482   
4               JP          301319            Kiyoharu           18507   

  inventor_last_name patent_number  
0              Clerk       3932991  
1               Adey       3939738  
2             Moberg       3939937  
3            Leconte       3941203  
4           Murakami       3941223