MongoDB Index intersection example?

642 views Asked by At

I am trying to understand why MongoDB fails to use Index intersection as mentioned over here.

I have inserted 10000 documents in intersection collection using the below code block:

for (var i = 0; i < 10; i++) {
    for (var j = 0; j < 10; j++) {
        for (var k = 0; k < 10; k++) {
            for (var l = 0; l < 10; l++) {
              db.intersection.insert({a:i, b:j, c:k, d:l});
            }
        }
    }
}

Then created these 3 indexes:
db.intersection.createIndex({ a })
db.intersection.createIndex({ b : 1, c : 1 })
db.intersection.createIndex({ d : 1 })

At this point I was expecting db.intersection.find({a:1,b:2,d:4}) to use an intersection between the 3 indexes ie. a_1, b_1_c_1, d_1

However this isn't the case and I could see that the winning plan uses only one index, d_1 :

"winningPlan" : {
                        "stage" : "FETCH",
                        "filter" : {
                                "$and" : [
                                        {
                                                "a" : {
                                                        "$eq" : 1
                                                }
                                        },
                                        {
                                                "b" : {
                                                        "$eq" : 2
                                                }
                                        }
                                ]
                        },
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "d" : 1
                                },
                                "indexName" : "d_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "d" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "d" : [
                                                "[4.0, 4.0]"
                                        ]
                                }
                        }
                },

Sorry I'm unable to post the allPlansExecution since it exceeds the body word limit

Additionally, winning plan for db.inter.find({a:1,b:2}) also uses just one index, b_1_c_1.

Can someone please provide an explanation for these results? Also a practical example demonstrating index intersection would be helpful.

1

There are 1 answers

2
Andrew Nessin On

Check this jira ticket on information about index intersection:

The query optimizer may select index intersection plans when the following conditions hold:

  1. Most of the documents in the relevant collection are disk-resident. The advantage of index intersection is that it can avoid fetching complete documents when the size of the intersection is small. If the documents are already in memory, there is nothing to gain by avoiding fetches.

  2. The query predicates are single point intervals, rather than range predicates or a set of intervals. Queries over single point intervals return documents sorted by disk location, which allows the optimizer to select plans that compute the intersection in a non-blocking fashion. This is generally faster than the alternative mode of computing the intersection, which is to build a hash table with the results from one index, and then probe it with the results from the second index.

  3. Neither of the indices to be intersected are highly selective. If one of the indices is selective then the optimizer will choose a plan which simply scans this selective index.
  4. The size of the intersection is small relative to the number of index keys scanned by either single-index solution. In this case the query executor can look at a smaller set of documents using index intersection, potentially allowing us to reap the benefits of fewer fetches from disk.

Apparently, mongodb can do better in most cases without using an index and it chooses to reject the intersection plan. It would be difficult to come up with an example which ensures that mongodb will use intersection.

For your example, if you see the rejectedPlans for the below query:

db.intersection.explain().find({a:1,b:2,d:4});

You'll find this as one of the plans (mongodb 3.4):

{
    "stage" : "FETCH",
    "filter" : {
        "$and" : [
            {
                "d" : {
                    "$eq" : 4
                }
            },
            {
                "a" : {
                    "$eq" : 1
                }
            },
            {
                "b" : {
                    "$eq" : 2
                }
            }
        ]
    },
    "inputStage" : {
        "stage" : "AND_SORTED",
        "inputStages" : [
            {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "d" : 1
                },
                "indexName" : "d_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "d" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "d" : [
                        "[4.0, 4.0]"
                    ]
                }
            },
            {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "a" : 1
                },
                "indexName" : "a_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "a" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "a" : [
                        "[1.0, 1.0]"
                    ]
                }
            }
        ]
    }
}

This (AND_SORTED stage) means mongodb did consider index intersection as a possibility, but concluded that d_1 index would perform much better.

Have a look at these answers too: here and here.