I'm trying to use cKDTree to find all of the data points within a specified distance (1500 m) of each centre. I have a dataframe of centres and a dataframe of raw data. My plan was to take the x and y coordinates extracted from each cluster and build a new dataframe of the data points that meet specific criteria. Here's what I have:
import numpy as np
import pandas as pd
import scipy.spatial as spatial
import matplotlib.pyplot as plt

points = perfed[['X', 'Y']].values
centres = producers[['X', 'Y']].values
x_list = []
y_list = []
point_tree = spatial.cKDTree(points)
cmap = plt.get_cmap('rainbow')
colors = cmap(np.linspace(0, 1, len(centres)))

for center, group, color in zip(centres, point_tree.query_ball_point(centres, 1500), colors):
    cluster = point_tree.data[group]
    x, y = cluster[:, 0], cluster[:, 1]
    plt.scatter(x, y, c=color, s=10)

d = {'X': [x_list],
     'Y': [y_list]}
output = pd.DataFrame.from_dict(d, orient='index').transpose()
# output = output.merge(producers, how='left', left_on='X', right_on='X')
The input dataset is just UTM x and y coordinates. Can anyone spot where I'm making the mistake? Thanks!
A coworker found this solution. It could probably be done in fewer lines, but it works.
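For context, the loop in the question never appends anything to x_list or y_list, so the final dataframe has nothing to hold. Below is a minimal sketch of one way to collect the matches. It is an illustration, not necessarily the coworker's code. The small `perfed` and `producers` frames are made-up stand-ins for the real data, and the `centre_X` / `centre_Y` column names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import spatial

# Made-up stand-ins for the real `perfed` and `producers` dataframes (UTM metres).
perfed = pd.DataFrame({'X': [0.0, 1000.0, 1400.0, 5000.0, 5200.0, 9000.0],
                       'Y': [0.0, 0.0, 0.0, 5000.0, 5000.0, 9000.0]})
producers = pd.DataFrame({'X': [0.0, 5000.0], 'Y': [0.0, 5000.0]})

points = perfed[['X', 'Y']].to_numpy()
centres = producers[['X', 'Y']].to_numpy()

point_tree = spatial.cKDTree(points)

# query_ball_point returns, for each centre, a list of row indices into `points`.
frames = []
for i, group in enumerate(point_tree.query_ball_point(centres, 1500)):
    cluster = perfed.iloc[sorted(group)].copy()
    # Hypothetical column names recording which centre each point belongs to.
    cluster['centre_X'] = centres[i, 0]
    cluster['centre_Y'] = centres[i, 1]
    frames.append(cluster)

# One row per (centre, point) pair within 1500 m.
output = pd.concat(frames, ignore_index=True)
```

A point that lies within 1500 m of more than one centre will appear once per centre in `output`. Whether that is wanted depends on the criteria being applied.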