I have a pandas DataFrame that I converted to cuDF to speed up extracting location names from its latitude and longitude columns. The CPU code that I have is as follows:
```python
import pandas as pd
from geopy.geocoders import Nominatim
from math import radians, sin, cos, sqrt, atan2
import time

cab_df = filtered_cab_data.copy()
geolocator = Nominatim(user_agent="reverse_geocoding_example")

def haversine_distance(lat1, lon1, lat2, lon2):
    # Convert latitude and longitude from degrees to radians
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    # Radius of the Earth in kilometers
    R = 6371
    # Haversine formula
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    distance = R * c
    return distance

def reverse_geocode(row):
    lat = row['pickup_latitude']
    lon = row['pickup_longitude']
    retries = 3  # Number of retries
    for _ in range(retries):
        try:
            location = geolocator.reverse((lat, lon))
            return location.address if location else None
        except Exception as e:
            print("Error:", e)
            time.sleep(2)  # Wait for a while before retrying
    return None

cab_df['pickup_location'] = cab_df.apply(reverse_geocode, axis=1)

def reverse_geocode_dropoff(row):
    lat = row['dropoff_latitude']
    lon = row['dropoff_longitude']
    retries = 3  # Number of retries
    for _ in range(retries):
        try:
            location = geolocator.reverse((lat, lon))
            return location.address if location else None
        except Exception as e:
            print("Error:", e)
            time.sleep(2)  # Wait for a while before retrying
    return None

cab_df['dropoff_location'] = cab_df.apply(reverse_geocode_dropoff, axis=1)
cab_df.head()
```
How can I modify this code to run on a GPU using a cuDF DataFrame? I have attempted a few modifications with cuSpatial and cuProj, but all of them raised a TypeError.
I have also tried the libraries and APIs normally used with pandas for extracting location information from latitude and longitude data (geopy, GeoPandas, Nominatim, etc.), but with no success. Most of them raise a TypeError as well.
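One thing worth noting about the code above: each call to `geolocator.reverse` is a network request, so the per-row `apply` sends one request per row even when many rows share the same coordinates. A minimal sketch of resolving each distinct coordinate pair only once is shown below. The names `geocode_unique`, `lookup`, `delay`, and `precision` are hypothetical and introduced only for illustration, and rounding coordinates before the lookup is an assumption that may not suit every dataset.

```python
import time


def _lookup_with_retries(lookup, key, retries, delay):
    """Call lookup(key), retrying on any exception; return None if all attempts fail."""
    for attempt in range(retries):
        try:
            return lookup(key)
        except Exception as exc:
            print("Error:", exc)
            if attempt < retries - 1:
                time.sleep(delay)  # wait before the next attempt
    return None


def geocode_unique(coords, lookup, retries=3, delay=2.0, precision=5):
    """Return one result per (lat, lon) pair, calling lookup once per distinct pair.

    `lookup` is a hypothetical callable taking a (lat, lon) tuple; it stands in
    for whatever reverse-geocoding call is actually used.
    """
    cache = {}
    results = []
    for lat, lon in coords:
        # Assumption: coordinates equal after rounding share one address.
        key = (round(lat, precision), round(lon, precision))
        if key not in cache:
            # Failed lookups are cached as None and are not retried later.
            cache[key] = _lookup_with_retries(lookup, key, retries, delay)
        results.append(cache[key])
    return results
```

With the pandas frame from the question, `lookup` could be a small wrapper that calls `geolocator.reverse(coord)` and returns `location.address` (or `None`), and `coords` could be `zip(cab_df['pickup_latitude'], cab_df['pickup_longitude'])`. For a cuDF frame the coordinate columns would first have to be copied back to the host, for example with `to_pandas()`, because the lookup itself runs on the CPU and over the network.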
A couple thoughts for you: