Mongodump using DBRef criteria and indexes?


I want to dump part of a database, specifically a subset of one collection, using the -q / --query option. Here is what I'm doing:

mongodump --host ... -o ... -q "{ pipe: DBRef(\"pipe\", ObjectId($2)) }"

The dump works but is extremely slow. There are 3M objects and there is an index on the pipe attribute, so it shouldn't be this slow. It looks like the query is scanning the whole collection.

Any ideas?

Thanks

Answered by attish

You can switch on the profiler and see what is happening. Basically, when you specify a query, mongodump just runs that query and writes the results to the dump. At first I thought it would snapshot the query, but it does not.
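As a rough sketch of how the profiler can be switched on and read back from the mongo shell: the helper names startProfiling and latestQueryEntry are made up for illustration, and db is assumed to be the shell's handle for the database being dumped. db.setProfilingLevel and the system.profile collection are the standard profiler interface.

```javascript
// Sketch for the mongo shell; `db` is the shell's database handle.
// startProfiling and latestQueryEntry are hypothetical helper names.

// Level 2 records every operation in db.system.profile.
function startProfiling(db) {
    return db.setProfilingLevel(2);
}

// Returns the most recent profiled query on namespace `ns`
// (for example "test1.t"), or null if none was recorded.
function latestQueryEntry(db, ns) {
    var entries = db.system.profile
        .find({ op: "query", ns: ns })
        .sort({ ts: -1 })
        .limit(1)
        .toArray();
    return entries.length > 0 ? entries[0] : null;
}
```

One way to use it: call startProfiling(db), run mongodump from another terminal, then printjson(latestQueryEntry(db, "test1.t")). Setting the level back to 0 with db.setProfilingLevel(0) turns the profiler off again.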

With the profiler on, I got the following results.

For this command: mongodump -h localhost:27417 -u ... -p ... -d test1 -c t

connected to: localhost:27417
Fri Aug 30 11:47:56.288 DATABASE: test1  to     dump/test1
Fri Aug 30 11:47:56.289     test1.t to dump/test1/t.bson
Fri Aug 30 11:47:56.291          101 objects
Fri Aug 30 11:47:56.291     Metadata for test1.t to dump/test1/t.metadata.json

The profiler recorded this:

{
    "op" : "query",
    "ns" : "test1.t",
    "query" : {
        "query" : {

        },
        "$snapshot" : true
    },
    "ntoreturn" : 0,
    "ntoskip" : 0,
    "nscanned" : 101,
    "keyUpdates" : 0,
    "numYield" : 0,
    "lockStats" : {
        "timeLockedMicros" : {
            "r" : NumberLong(499),
            "w" : NumberLong(0)
        },
        "timeAcquiringMicros" : {
            "r" : NumberLong(2),
            "w" : NumberLong(4)
        }
    },
    "nreturned" : 101,
    "responseLength" : 11410,
    "millis" : 0,
    "ts" : ISODate("2013-08-30T09:54:18.605Z"),
    "client" : "127.0.0.1",
    "allUsers" : [
        {
            "user" : "__system",
            "userSource" : "local"
        }
    ],
    "user" : "__system@local"
}

For this command: mongodump -h localhost:27417 -u ... -p ... -d test1 -c t -q {\'b.b\':8}, I got this output:

connected to: localhost:27417
Fri Aug 30 11:58:15.332 DATABASE: test1  to     dump/test1
Fri Aug 30 11:58:15.334     test1.t to dump/test1/t.bson
Fri Aug 30 11:58:15.335          1 objects
Fri Aug 30 11:58:15.336     Metadata for test1.t to dump/test1/t.metadata.json

In the profiler:

{
    "op" : "query",
    "ns" : "test1.t",
    "query" : {
        "b.b" : 8
    },
    "ntoreturn" : 0,
    "ntoskip" : 0,
    "nscanned" : 1,
    "keyUpdates" : 0,
    "numYield" : 0,
    "lockStats" : {
        "timeLockedMicros" : {
            "r" : NumberLong(174),
            "w" : NumberLong(0)
        },
        "timeAcquiringMicros" : {
            "r" : NumberLong(3),
            "w" : NumberLong(4)
        }
    },
    "nreturned" : 1,
    "responseLength" : 134,
    "millis" : 0,
    "ts" : ISODate("2013-08-30T09:58:15.335Z"),
    "client" : "127.0.0.1",
    "allUsers" : [
        {
            "user" : "__system",
            "userSource" : "local"
        }
    ],
    "user" : "__system@local"
}

With nscanned at 1 and nreturned at 1 on a 101-document collection, the query used the index I had created. So in your case either something else is the bottleneck, or your query is not supported by the index you created.
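To compare the two counters on your own profiler entries, something like the following could help. It is only a heuristic sketch: scanSummary is a made-up name, and totalDocs is assumed to come from a separate count of the collection. It looks at one profiler entry at a time, so on a large dump the later getmore entries would need to be checked too.

```javascript
// Heuristic sketch: summarise how much a profiled query scanned
// compared with what it returned. scanSummary is a hypothetical helper.
function scanSummary(entry, totalDocs) {
    var scanned = entry.nscanned;
    var returned = entry.nreturned;
    return {
        scanned: scanned,
        returned: returned,
        // Items examined per document returned; close to 1 is ideal.
        scannedPerReturned: returned > 0 ? scanned / returned : null,
        // Examined at least the whole collection but returned fewer
        // documents: the filter probably did not use an index.
        looksLikeFullScan: scanned >= totalDocs && returned < scanned
    };
}

// Entry from the filtered dump above (nscanned 1, nreturned 1, 101 docs).
var indexed = scanSummary({ nscanned: 1, nreturned: 1 }, 101);

// Invented entry showing what an unindexed filter would look like.
var unindexed = scanSummary({ nscanned: 101, nreturned: 1 }, 101);
```

Here indexed.looksLikeFullScan is false and unindexed.looksLikeFullScan is true. If your DBRef query looks like the second case, the index on pipe is not being used for it.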