MongoDB sparse Index and Array: too many documents indexed


I have a question about MongoDB sparse indexes. I have a collection (post) of very small documents (6K the biggest), each of which may embed a sub-document like this:

{
  "a": "a-val",
  "b": "b-val",
  "meta": {
     "urls": [ "url1", "url2" ... ],
     "field1": "value1",
     ...
  }
}

The fields "a" and "b" are always present, but "meta.urls" may not exist.

Now, I have inserted just one document with a "meta.urls" value and then ran:

db.post.ensureIndex({"a": 1, "b": 1, "meta.urls": 1}, {sparse: true});

The stats for post give me a "strange" result: the index is about 97MB! How is that possible? Only one document has "meta.urls", yet the index size is 97MB?

So I tried creating an index on "meta.urls" alone:

db.post.ensureIndex({"meta.urls": 1}, {sparse: true});

I now have a "meta.urls_1" index containing just 1 document. But if I explain a simple query like this

db.post.find({"meta.urls": {$exists: true}}).hint("meta.urls_1").explain({verbose: true});

I get another "strange" result:

"n" : 1,
"nscannedObjects" : 5,
"nscanned" : 5,

Why does Mongo scan 5 docs, and not just the one in the index?

If I query for an exact match on "meta.urls", the single-field sparse index works correctly.

Example: db.post.find({"meta.urls": "url1"}).hint("meta.urls_1") // 1 document

There is 1 answer

Answered by Aaron Dufour

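For your first question: a compound index can be used to search on a prefix of the keys it indexes. For example, your first index would be used if you searched on just a, or on both a and b. Because of this, a sparse compound index only skips documents in which none of the indexed fields is present. Since a and b are always present in your documents, every document ends up in the index, regardless of whether it has meta.urls.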
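To make the difference between the two indexes concrete, here is a rough sketch in plain JavaScript. It is only a simplified model, not MongoDB's actual implementation: the helpers getPath and isIndexedBySparse and the sample documents are hypothetical, and the model assumes a sparse index skips a document only when none of the indexed fields is present.

```javascript
// Simplified model of which documents get an entry in a sparse index.
// Assumption: a document is skipped only if NONE of the indexed fields exists.

// Hypothetical helper: resolve a dotted path such as "meta.urls".
// Returns undefined when any part of the path is missing.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value != null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

// Hypothetical helper: true if the document would be included
// in a sparse index built on the given fields.
function isIndexedBySparse(doc, fields) {
  return fields.some((field) => getPath(doc, field) !== undefined);
}

// Made-up sample data: only the first document has "meta.urls".
const posts = [
  { a: "a1", b: "b1", meta: { urls: ["url1", "url2"] } },
  { a: "a2", b: "b2" },
  { a: "a3", b: "b3", meta: { field1: "value1" } },
];

const compoundCount = posts.filter((p) =>
  isIndexedBySparse(p, ["a", "b", "meta.urls"])
).length;
const singleCount = posts.filter((p) =>
  isIndexedBySparse(p, ["meta.urls"])
).length;

console.log(compoundCount); // 3: every document has "a" and "b"
console.log(singleCount);   // 1: only one document has "meta.urls"
```

Under this model the compound index covers the whole collection, while the index on "meta.urls" alone covers only the documents that actually have that field.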

I don't have an answer for your second question, but you should try updating MongoDB and running the query again. It's moving pretty quickly, and sparse indexes have gotten better in the past few months.