I am using patent-client to look up the patent numbers for 250,000 doc-numbers. My original data looks like this:
   invention_title                                    doc_number  date
0  Doughnut product with six appendages               29327507    2008
1  Doughnut product with six appendages and witho...  29327632    2008
2  Doughnut product with six appendages and witho...  29327637    2008
3  Meat piece                                         29298838    2007
4  Pet treat                                          29320494    2008
...
I am trying to use patent-client to retrieve the patent number of each observation from its doc-number (column doc_number), as follows:
# Import the model classes you need
from patent_client import Inpadoc, Assignment, USApplication
# Fetch US Applications
app = {}
patent_x = []
publn_nr = []
for i in range(len(df_all)):
    try:
        app[i] = USApplication.objects.get(df_all['doc_number'][i])
    except:
        pass
    try:
        patent_x.append(app[i].patent_number)
    except:
        patent_x.append('')
    try:
        publn_nr.append(app[i].publication_number)
    except:
        publn_nr.append('')
df_all['patent_x'] = patent_x
df_all['publn_nr'] = publn_nr
However, this code takes a huge amount of time, whereas looking up a single doc-number on its own seems very fast. Why is that? Is there a way to speed up the process?
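One direction I am considering, as a rough sketch only: if each lookup is a separate network request (my assumption, I have not confirmed it), then running the lookups through a thread pool and querying each distinct doc-number only once might help. The names `fetch_numbers` and `lookup` below are hypothetical helpers of mine, not part of patent-client, and I do not know whether patent-client is safe to call from several threads or whether the remote service limits request rates.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_numbers(doc_numbers, lookup, max_workers=8):
    """Look up each distinct doc-number once, using a pool of threads.

    `lookup` is a hypothetical callable that maps a doc-number to a
    (patent_number, publication_number) tuple. A lookup that raises
    yields ('', ''), mirroring the empty strings used in the loop above.
    """
    def safe_lookup(doc_number):
        try:
            return lookup(doc_number)
        except Exception:
            return ('', '')

    # De-duplicate while keeping first-seen order, so repeated
    # doc-numbers trigger only one lookup each.
    unique = list(dict.fromkeys(doc_numbers))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(zip(unique, pool.map(safe_lookup, unique)))
    # Return one result per input row, in the original order.
    return [results[d] for d in doc_numbers]


# Intended use with the code above (untested assumption):
# def lookup(doc_number):
#     app = USApplication.objects.get(doc_number)
#     return app.patent_number, app.publication_number
#
# pairs = fetch_numbers(df_all['doc_number'].tolist(), lookup)
# df_all['patent_x'], df_all['publn_nr'] = zip(*pairs)
```

Passing `lookup` in as a parameter keeps the sketch independent of patent-client, so it can be tried with a dummy function on a small sample before running it on all 250,000 rows.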
Thank you