Adding a column to a pandas DataFrame based on a dictionary key


I have the following dataframe:

id  ip  
1   219.237.42.155
2   75.74.144.120
3   219.237.42.155

Using the maxminddb-geolite2 package, I can find out which city a specific IP is assigned to. The following code:

from geolite2 import geolite2
reader = geolite2.reader()
reader.get('219.237.42.155')

will return a dictionary, and by looking up keys, I can actually get a city name:

reader.get('219.237.42.155')['city']['names']['en']

returns:

'Beijing'

The problem I have is that I do not know how to get the city for each ip in the dataframe and put it in the third column, so the result would be:

id  ip              city
1   219.237.42.155  Beijing
2   75.74.144.120   Hollywood
3   219.237.42.155  Beijing

The farthest I got was mapping the whole dictionary to a separate column by using the code:

df['city'] = df['ip'].apply(lambda x: reader.get(x))

On the other hand:

df['city'] = df['ip'].apply(lambda x: reader.get(x)['city']['names']['en'])

throws a KeyError. What am I missing?

There is 1 answer

Answered by Allen Qin
# Use apply to check that the lookup returned a record before accessing its keys.
# Assumes numpy is imported as np and pandas as pd.
df.apply(lambda x: reader.get(x.ip), axis=1).apply(lambda x: np.nan if pd.isnull(x) else x['city']['names']['en'])
Out[39]: 
0    Beijing
1        NaN
2    Beijing
dtype: object
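
The check above only covers a lookup that returns nothing. A KeyError like the one in the question can also come from a record that exists but has no 'city' entry, which is probably what happens for 75.74.144.120 here. The following is a minimal sketch of a helper that tolerates both cases. The name city_name and the fake_db dictionary are hypothetical stand-ins for illustration, not part of the geolite2 package, and the record contents are assumed rather than taken from a real database:

```python
def city_name(record, lang="en"):
    """Return the city name from a GeoLite2-style record, or None if it is missing."""
    if not isinstance(record, dict):
        return None  # the lookup returned nothing usable (e.g. None)
    # Walk the nested keys with .get() so a missing level yields None, not KeyError.
    return record.get("city", {}).get("names", {}).get(lang)


# Hypothetical stand-in for reader.get(); real records carry many more fields.
fake_db = {
    "219.237.42.155": {"city": {"names": {"en": "Beijing"}}},
    # Assumed shape: a record with country data but no 'city' key.
    "75.74.144.120": {"country": {"names": {"en": "United States"}}},
}

ips = ["219.237.42.155", "75.74.144.120", "219.237.42.155", "10.0.0.1"]
cities = [city_name(fake_db.get(ip)) for ip in ips]
print(cities)  # ['Beijing', None, 'Beijing', None]
```

With the real reader, the same helper could be applied to the dataframe along the lines of df['city'] = df['ip'].apply(lambda ip: city_name(reader.get(ip))), which leaves a missing value in rows where no city is available.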