Performance of RTree vs kd-trees

4.6k views Asked by At

I have around 10 K points in 5 dimensional space. We can assume that the points are randomly distributed in space (0,0,0,0,0) and (100,100,100,100,100). Clearly, the whole data set can easily reside in memory.

I would like to know which algorithm for k nearest neighbour would run faster, kd-tree or RTree.

Although I have some very high level idea of these two algorithms, I am not sure which will run faster, and why. I am open to exploring other algorithms if any, which could run fast. Please, if possible, specify why an algorithm may run faster.

1

There are 1 answers

0
Has QUIT--Anony-Mousse On

This depends on various parameters. Most importantly on your capability to implement these algorithms.

I've personally found bulk-loaded R*-trees to be faster for large data, probably because they have a better fan-out. Bulk-loaded R-trees is a more fair comparison, as kd-trees are commonly bulk-loaded (in fact, they don't support incremental operation very well at all).

For tiny data, kd-trees will likely be faster, plus they are much simpler to implement.

For other things, please refer to this earlier question / answer:

https://stackoverflow.com/a/11109467/1060350