How do MongoDB multi-keys sort?

2.3k views Asked by At

In MongoDB, a field can have multiple values (an array of values). Each of them is indexed, so you can filter on any of the values. But can you also "order by" a field with multiple values and what is the result?

Update:

> db.test.find().sort({a:1})
{ "_id" : ObjectId("4f27e36b5eaa9ebfda3c1c53"), "a" : [ 0 ] }
{ "_id" : ObjectId("4f27e3845eaa9ebfda3c1c54"), "a" : [ 0, 1 ] }
{ "_id" : ObjectId("4f27df6e5eaa9ebfda3c1c4c"), "a" : [ 1, 1, 1 ] }
{ "_id" : ObjectId("4f27df735eaa9ebfda3c1c4d"), "a" : [ 1, 1, 2 ] }
{ "_id" : ObjectId("4f27df795eaa9ebfda3c1c4e"), "a" : [ 2, 1, 2 ] }
{ "_id" : ObjectId("4f27df7f5eaa9ebfda3c1c4f"), "a" : [ 2, 2, 1 ] }
{ "_id" : ObjectId("4f27df845eaa9ebfda3c1c50"), "a" : [ 2, 1 ] }
{ "_id" : ObjectId("4f27e39a5eaa9ebfda3c1c55"), "a" : [ 2 ] }

With unequal length arrays the longer array is "lower" than the shorter array

So, why is [0] before [0,1], but [2] after [2,1] ? Is maybe sorting only done on the first array element? Or the lowest one? And after that it is insertion order?

Also, how is this implemented in the case of an index scan (as opposed to a table scan)?

1

There are 1 answers

10
Remon van Vliet On BEST ANSWER

Sorting of array elements is pretty complicated. Since array elements are indexed seperately sorting on an array field will actually result in some interesting situations. What happens is that MongoDB will sort them based on the lowest or highest value in the array (depending on sort direction). Beyond that the order is natural.

This leads to things like :

> db.test.save({a:[1]})
> db.test.save({a:[0,2]})
> db.test.find().sort({a:1})
{ "_id" : ObjectId("4f29026f5b6b8b5fa49df1c3"), "a" : [ 0, 2 ] }
{ "_id" : ObjectId("4f2902695b6b8b5fa49df1c2"), "a" : [ 1 ] }
> db.test.find().sort({a:-1})
{ "_id" : ObjectId("4f29026f5b6b8b5fa49df1c3"), "a" : [ 0, 2 ] }
{ "_id" : ObjectId("4f2902695b6b8b5fa49df1c2"), "a" : [ 1 ] }

In other words. The same order for reversed sorts. This is due to the fact that the "a" field of the top document holds both the lowest and the highest value.

So effectively for the sort MongoDB ignores all values in the array that are not either the highest ({field:-1} sort) or the lowest ({field:1} sort) and orders the remaining values.

To paint an (oversimplified) picture it works something like this :

flattened b-tree for index {a:1} given above sample docs :

"a" value 0 -> document 4f29026f5b6b8b5fa49df1c3
"a" value 1 -> document 4f2902695b6b8b5fa49df1c2
"a" value 2 -> document 4f29026f5b6b8b5fa49df1c3

As you can see scanning from both top to bottom and bottom to top will result in the same order.

Empty arrays are the "lowest" possible array value and thus will appear at the top and bottom of the above queries respectively.

Indexes do not change the behaviour of sorting on arrays.