Algorithm to find k neighbors in a certain range?

1.1k views Asked by At

Suppose there is a point cloud having 50 000 points in the x-y-z 3D space. For every point in this cloud, what algorithms or data strictures should be implemented to find k neighbours of a given point which are within a distance of [R,r]? Naive way is to go through each of the 49 999 points for each of the 50 000 points and do a metric testing. But this approach will take large time. Just like there is kd tree to find nearest neighbour in small time so is there some real-time DS/algo implementation out there to pre-process the point clouds to achieve the goal inn shortest time?

4

There are 4 answers

0
Edward Doolittle On BEST ANSWER

Your problem is part of the topic of Nearest Neighbor Search, or more precisely, k-Nearest Neighbor Search. The answer to your question depends on the data structure you are using to store the points. If you use R-trees or variants like R*-trees, and you are doing multiple searches on your database, you will likely find a substantial performance improvement in two or three-dimensional space compared with naive linear search. In higher dimensions, space partitioning schemes tend to underperform linear search.

1
Michael Lorton On

If you are using an ordinary Euclidean metric, you could go through the list three times and extract those points that within R in each dimension, essentially extracting the enclosing cube. Searching the resulting list would still be O(n^2), but on a much smaller n.

4
Ely On

There are efficient algorithms (in average, for random data), see Nearest neighbor search.

Your approach is not efficient, yet simple.

Please read through, check you requirements and get back so we can help.

2
stefan On

As some answers already suggest for NN search you could use some tree algorithm like k-d-tree. There are implementations available for all programming languages.

If your description [R,r] suggests a hollow sphere you should compare one-time-testing (within interval) vs. two stages (test-for-outer and remove samples that pass test-for-inner).

You also did not mention performance requirements (timing or frame rate?) and your intended application (feasible approach?).