I'm working on a flocking boids simulation just for fun, and I want to optimise it a bit. The area that needs work is finding boids near a given boid. I figure that to do that some kind of spatial data-structure suited to the task would be my best bet (see here and scroll down a bit.).
Whatever I go with, I'll implement myself, from scratch, in Java. That way I'll learn more about the data structure I choose than I would if I just called a bunch of library functions.
I'm aware of R-Trees, k-d trees, and Quadtrees. They're all feasible options, in my opinion. But I don't have any experience with these data structures and I'm not totally sure what best suits my purpose. I don't need anything on this scale - I'm talking maybe a few hundred boids, perhaps at most one thousand, rather than a million, although bear in mind I might end up running it on an Android phone eventually.
Please recommend me a data structure (not limited to the above, of course) for this, and give me a good reason to choose it over the alternatives.
Yes, I've seen this question. No, I'm not satisfied with the answer - there's no reasoning given at all.
Oh, one other thing - like the title says, this is strictly for two dimensions only.
Nine years too late to help Iskar Jarak but I will chime in since others may find this old question in a search.
The key point is that almost anything is better than the naive O(n²) approach of looking at every pair of boids. So as Iskar suggests, any kind of tree based spatial data structure is a vast improvement, being basically O(n log n). That is: each of the n boids needs to look up its neighbors each of which is O(log n).
It should be noted that “hit detection” mentioned by Desmond Vehar is a slightly different problem, asking “is this query position ‘inside’ any of the ‘objects’ in my data structure?” In multi-agent simulations we want to find multiple neighbors. Sometimes “nearest N neighbors” or perhaps “all neighbors within a given radius of the query position.” It usually is sufficient to describe a boid as its center point and filter by distance between points, producing a “query sphere.”
In his algorithms courses Tim Roughgarden keeps asking “can we do better?” And indeed—while O(log n) is better than O(n) to look up a neighbor—the best is constant time lookup, O(1). Here hash tables (aka unsorted maps) are our friends.
These two ideas, multiple neighbors and hashing, lead to a good approach for accelerating multi-agent simulations: spatial hashing into voxels(/boxes/lattices). Each voxel contains a collection of boids/agents. The continuous space position (usually 2 or 3 floats) of the agent is converted into voxel coordinates (2 or 3 ints, by a scaled “floor” operation), which are hashed together to produce a “voxel ID” which is used as a key in a hash table of voxels. So in O(1) constant time, like an array reference, you can look up the voxel which contains a collection of all agents currently in the same voxel. The “query sphere” will normally overlap multiple voxels. These can be found by offsetting the query point by multiples of the voxel spacing distance. The contents of these several voxels are merged together to form a collection of potential neighbors, then filtered by distance. As agents/boids move, they compare their old and new “voxel ID” and if not equal, they remove themself from the data structure, then reinsert at the new position.
Query sphere in voxel space:
Resources:
There are roughly a zillion types of “spatial data structure”/“spatial database.” See this Wikipedia article for a survey: Spatial database. See also Hanan Samet’s early papers: 1989’s Hierarchical Spatial Data Structures and 1995’s Spatial Data Structures.
In my own work in the early 2000s, I used fixed sized collection of voxels, addressed with i,j,k coordinates: Interaction with Groups of Autonomous Characters in 2000, and Big Fast Crowds on PS3 in 2006. Later, I switched to using the “spatial hashing into voxels” approach, which does not require a priori specification of the 3d voxel grid dimensions, and has zero overhead for sparse agent distribution with many empty voxels.
[Edit in 2023: I should have mentioned: Nießner, Zollhöfer, Izadi, Stamminger. 2013. Real-time 3D Reconstruction at Scale using Voxel Hashing. ACM TOG]