So if I have a query that does the following (in pseudo code)

find(a nearby x, b > y).sort(c)

where a is a geo column, b is type of long, and c is also a type of long

Would the compound index on (a:2d, b:1, c:1) work and suggested?

1

There are 1 answers

9
Eve Freeman On BEST ANSWER

Geospatial queries have their own index category (as you mention), and the geohashing greatly improves the index performance of the first key lookup--it's better than a range if you can set it up right. In any case, I think your strategy will work: the key will be setting $maxDistance to something fairly small.

I inserted 10 million random geo records to match your description, like so:

{ "_id" : ObjectId("4f28e1cffc90631d239f8b5a"), "a" : [ 46, 47 ], "b" : ISODate("2012-02-01T06:53:25.543Z"), "c" : 19 }
{ "_id" : ObjectId("4f28e1bdfc90631d239c4272"), "a" : [ 54, 48 ], "b" : ISODate("2012-02-01T06:53:32.699Z"), "c" : 20 }
{ "_id" : ObjectId("4f28e206fc90631d23aac59d"), "a" : [ 46, 52 ], "b" : ISODate("2012-02-01T06:55:14.103Z"), "c" : 22 }
{ "_id" : ObjectId("4f28e1a7fc90631d23995700"), "a" : [ 54, 52 ], "b" : ISODate("2012-02-01T06:52:33.312Z"), "c" : 27 }
{ "_id" : ObjectId("4f28e1d7fc90631d23a0e9e7"), "a" : [ 52, 46 ], "b" : ISODate("2012-02-01T06:53:11.315Z"), "c" : 31 }

With the maxDistance at something below 10 the performance is really quite good.

db.test13.find({a:{$near:[50,50], $maxDistance:4}, b:{$gt:d}}).sort({c:1}).explain();
{
"cursor" : "GeoSearchCursor",
"nscanned" : 100,
"nscannedObjects" : 100,
"n" : 100,
"scanAndOrder" : true,
"millis" : 25,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {

}
}

If you leave out maxDistance it starts to suffer quite a bit. Some of the queries took up to 60 seconds to run. The secondary range parameter doesn't seem to help much, even if the range is fairly narrow--it seems to be all about the maxDistance.

I recommend you play around with it to get a feel for how the geospatial index performs. Here is my test insert loop. You can try limiting the bits as well for less resolution

function getRandomTime() {
   return new Date(new Date() - Math.floor(Math.random()*1000000));
}

function getRandomGeo() {
   return [Math.floor(Math.random()*360-180),Math.floor(Math.random()*360-180)];
}

function initialInsert() {
   for(var i = 0; i < 10000000; i++) {
      db.test13.save({
         a:getRandomGeo(),
         b:getRandomTime(),
         c:Math.floor(Math.random()*1000)
      });
   }
}