Geopy function only returning some cities?


I wrote a function using geopy to return the city for a set of latitude/longitude coordinates. However, the function returned a city for only about 10% of the entries. When I run the code on single entries it always returns the city, so nothing is wrong with the individual rows of data. Here is the function I wrote:


#importing libraries
from tkinter import *
from geopy.geocoders import Nominatim
from geopy.geocoders import Photon

#Create an instance of tkinter frame
win = Tk()  

#Define geometry of the window  
win.geometry("700x350")  

#creating a function  
def get_city(coords):  
    #instantiate the Nominatim API  
    geolocator = Nominatim(user_agent="MyApp")  
    #get the city from the coordinates   
    location = geolocator.reverse(coords)  
    address = location.raw['address']   
    city = address.get('city', '')    
    #return the city   
    return city    


#applying function to dataframe 
irma['city'] = irma['coordinates'].apply(get_city)  
 

I was expecting the function to return the city for every row, but it returned a city for only about 10% of the rows.

[Image: first five entries of the dataframe, showing a city returned for only one row]

1 answer

Answer by Ehsan Hamzei

This is because OSM attribute data is highly incomplete. Checking just the first coordinates in your data frame, the raw dictionary does contain an 'address' key, but that address has no 'city' entry, although it does have 'town' and even 'road'. In your case you may actually want 'town' here.

This is my simple code to get the results for the first coordinates:

from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="MyApp")  
coords = [27.49816, -80.37626]
location = geolocator.reverse(coords)
print(location.raw['address'])
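
Building on that observation, here is a minimal sketch of a lookup that falls back to other place-type keys when 'city' is missing. The names pick_place, get_place and PLACE_KEYS are hypothetical, and the key order is an assumption about which Nominatim address fields you consider equivalent to a city; adjust it to your data.

```python
# Address keys to try in order. This list and its order are assumptions;
# Nominatim labels a place as city, town, village, etc. depending on OSM tagging.
PLACE_KEYS = ("city", "town", "village", "hamlet", "municipality")


def pick_place(address, keys=PLACE_KEYS):
    """Return the first non-empty value among `keys`, or '' if none is present."""
    for key in keys:
        value = address.get(key)
        if value:
            return value
    return ""


def get_place(coords, geolocator):
    """Reverse-geocode `coords` and return a city-like name ('' if nothing is found)."""
    location = geolocator.reverse(coords)
    if location is None:  # reverse() returns None when there is no result
        return ""
    return pick_place(location.raw.get("address", {}))


# Usage with the question's dataframe (not run here, since it needs network access):
#   from geopy.geocoders import Nominatim
#   geolocator = Nominatim(user_agent="MyApp")
#   irma['city'] = irma['coordinates'].apply(lambda c: get_place(c, geolocator))
```

Passing the geolocator in as an argument also avoids creating a new Nominatim instance for every row, as the original get_city does.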


If you only need cities, I suggest using a local shapefile of city boundaries from an official source (these are usually available alongside each Census dataset) and using geopandas to do a point-in-polygon spatial join, matching the points in your dataframe to the city polygons that contain them: gpd.sjoin(gpd_of_your_dataframe, city_polygon_df, predicate='within'). In geopandas versions before 0.10 the keyword is op='within' instead of predicate='within'.

  • You need to create a geodataframe in geopandas from your current data frame
  • Then, you need to read a shapefile of city boundaries
  • Finally, you need to perform the spatial join

This would be much faster, since there is no need to call the OSM API, and it will be highly accurate because you are using official datasets.
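
The three steps in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not a drop-in solution: it assumes the 'coordinates' column holds (lat, lon) pairs in WGS84, that the boundary file has a name column called 'NAME' (hypothetical; check your shapefile), and that geopandas 0.10 or later is installed. A toy rectangle stands in for the real city boundaries so the example runs on its own.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box


def add_city_from_polygons(df, cities, name_col="NAME"):
    """Return a copy of df with a 'city' column taken from the containing polygon."""
    # Split the assumed (lat, lon) pairs into two numeric columns.
    latlon = pd.DataFrame(
        df["coordinates"].tolist(), index=df.index, columns=["lat", "lon"]
    )
    # Step 1: build a GeoDataFrame of points (x is longitude, y is latitude).
    points = gpd.GeoDataFrame(
        df.copy(),
        geometry=gpd.points_from_xy(latlon["lon"], latlon["lat"]),
        crs="EPSG:4326",
    )
    # Keep only the name and geometry, reprojected to match the points.
    polygons = cities.to_crs(points.crs)[[name_col, cities.geometry.name]]
    # Step 3: left join, so rows outside every polygon are kept with a missing city.
    joined = gpd.sjoin(points, polygons, how="left", predicate="within")
    return joined.drop(columns="index_right").rename(columns={name_col: "city"})


# Step 2 stand-in: with real data you would load the boundaries with something like
# gpd.read_file("city_boundaries.shp") (hypothetical path) instead of this toy polygon.
toy_cities = gpd.GeoDataFrame(
    {"NAME": ["Toyville"]},
    geometry=[box(-81.0, 27.0, -80.0, 28.0)],  # minx, miny, maxx, maxy
    crs="EPSG:4326",
)
toy_df = pd.DataFrame({"coordinates": [(27.49816, -80.37626), (0.0, 0.0)]})

result = add_city_from_polygons(toy_df, toy_cities)
print(result[["coordinates", "city"]])
```

Points that fall outside every polygon come back with a missing city, which makes it easy to see which rows the boundary file does not cover.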