I have a dataframe with two different columns of locations in it. I have been able to find their latitudes and longitudes using geolocator
and this old Stack Overflow post. Now I'm stuck trying to find the distance between these two columns of locations. I've been following this website's information, trying to use geodesic
as it directs.
The goal is to create a fifth and final column that shows me the distance between the two locations in each row. I'm getting an error that says:
ValueError: When creating a Point from sequence, it must not have more than 3 items.
I have created a small fake dataset, but please be aware that my real dataset is very large, so I need a solution that works over thousands of rows, some of which contain NaNs. The treatment is the same. The logic used to build this fake dataset may look odd, but it throws the same error as my original dataset, and it gets me where I need to go with my real data, which has many more unique values in both location columns.
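import pandas as pd
from geopy.distance import geodesic

# geolocator is assumed to be an existing geopy geocoder instance (its creation is not shown in this post)
places_data = pd.DataFrame(
    {"Place_1": ["Disneyland Park", "Empire State Building", "Yosemite Park", "Disney World Park", "Rockefeller Tower", "Grand Canyon"],
     "Places": ["Peaches", "Apples", "Peaches", "Peaches", "Apples", "Peaches"]}
)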
other_places = places_data.copy()
other_places.loc[(other_places["Places"] == "Peaches"), "Sites"] = "Georgia Aquarium"
other_places.loc[(other_places["Places"] == "Apples"), "Sites"] = "World of Coca-Cola"
other_places["Loc_1"] = other_places["Place_1"].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))
other_places["Loc_2"] = other_places["Sites"].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))
places_data['Loc_1'] = places_data.Place_1.map(dict(other_places[['Place_1','Loc_1']].to_numpy()))
places_data['Loc_2'] = places_data.Places.map(dict(other_places[['Places','Loc_2']].to_numpy()))
# This is the line that raises the ValueError quoted above
places_data["Distance"] = geodesic(places_data["Loc_1"], places_data["Loc_2"]).miles
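
One possible direction, sketched under assumptions: the error appears to come from passing whole columns to geodesic, which expects a single point (one latitude/longitude pair) per argument. If so, the distance has to be computed one row at a time, skipping rows where either location is missing. The sketch below uses only the standard library so it runs on its own. haversine_miles is a stand-in great-circle formula (it is not geopy's geodesic and will differ slightly from it), and is_missing, safe_distance, and pairwise_distances are hypothetical helper names, not part of any library.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def is_missing(loc):
    """True for None or a float NaN, which is how pandas marks missing cells."""
    return loc is None or (isinstance(loc, float) and math.isnan(loc))


def haversine_miles(a, b):
    """Great-circle distance in miles between two (latitude, longitude) pairs.

    Stand-in for geopy's geodesic so the sketch has no third-party dependency.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def safe_distance(loc_a, loc_b, dist_fn=haversine_miles):
    """Distance for one row; NaN when either location is missing."""
    if is_missing(loc_a) or is_missing(loc_b):
        return math.nan
    return dist_fn(loc_a, loc_b)


def pairwise_distances(locs_a, locs_b, dist_fn=haversine_miles):
    """Apply safe_distance to each pair, one row at a time."""
    return [safe_distance(a, b, dist_fn) for a, b in zip(locs_a, locs_b)]


# Made-up coordinates for illustration only
loc_1 = [(40.0, -75.0), math.nan, (41.0, -75.0)]
loc_2 = [(40.0, -75.0), (33.0, -84.0), (40.0, -75.0)]
distances = pairwise_distances(loc_1, loc_2)
```

With the dataframe from the question, something like places_data.apply(lambda row: safe_distance(row["Loc_1"], row["Loc_2"], lambda a, b: geodesic(a, b).miles), axis=1) should fill the Distance column, assuming Loc_1 and Loc_2 hold (latitude, longitude) tuples or NaN.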