How to split an array for KNN?

206 views Asked by At

I'm using the brute-force algorithm for KNN to find nearest neighbors in my web service. One downside of this approach is that I need enough memory on every machine to load the whole array for KNN. Now I'm thinking of splitting the array, do KNN separately on many machines, and then merge the results using merge sort. But that would be slow if the client need to create a lot of connections to query each part of the result.

I have read about algorithms like KDTree and Balltree in the documentation of sklearn, I wonder if I can somehow separate the array so that I can quickly check which part of the array I need to look into to find nearest neighbors.

0

There are 0 answers