Mongo database size inconsistency

I use Mongo GridFS and have a fairly big Mongo database. db.stats() currently reports a dataSize of 89GB.

When I create a dump with mongodump, the dump directory takes 86GB on the file system. When I restore that dump on another machine and run db.stats() there, I get a dataSize of 122GB.

Does anyone know what's the reason behind this 33GB rise in dataSize after a dump/restore?

Edit: Here are the stats from the initial database:

MongoDB shell version: 2.4.5
connecting to: imgdb
rs0:PRIMARY> db.stats();
{
        "db" : "imgdb",
        "collections" : 4,
        "objects" : 2549884,
        "avgObjSize" : 37802.88397276111,
        "dataSize" : 96392968996,
        "storageSize" : 363433842080,
        "numExtents" : 207,
        "indexes" : 4,
        "indexSize" : 307245904,
        "fileSize" : 366974337024,
        "nsSizeMB" : 16,
        "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
        },
        "ok" : 1
}

And here are the stats from the restored database:

MongoDB shell version: 2.6.4
connecting to: imgdb
> db.stats();
{
        "db" : "imgdb",
        "collections" : 4,
        "objects" : 2549924,
        "avgObjSize" : 51781.40103312883,
        "dataSize" : 132038637248,
        "storageSize" : 132281756768,
        "numExtents" : 98,
        "indexes" : 4,
        "indexSize" : 199976784,
        "fileSize" : 135159349248,
        "nsSizeMB" : 16,
        "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
        },
        "extentFreeList" : {
                "num" : 0,
                "totalSize" : 0
        },
        "ok" : 1
}
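
For reference, the GB figures quoted in the question can be recomputed from the two raw dataSize values. This is a small sketch in plain JavaScript (run with Node.js, not the mongo shell); the helper name `toGiB` is made up for illustration, and it assumes "GB" here means 1024^3 bytes.

```javascript
// dataSize values copied from the two db.stats() outputs above (bytes).
var before = 96392968996;   // original database, MongoDB 2.4.5
var after = 132038637248;   // restored database, MongoDB 2.6.4

var GIB = 1024 * 1024 * 1024;

// Convert a byte count to GiB, rounded to two decimals.
function toGiB(bytes) {
  return Math.round((bytes / GIB) * 100) / 100;
}

var growthGiB = toGiB(after - before); // absolute growth after the restore
var ratio = after / before;            // relative growth after the restore

console.log(toGiB(before), toGiB(after), growthGiB, ratio.toFixed(2));
```

In the mongo shell itself, passing a scale to db.stats(), for example db.stats(1024 * 1024 * 1024), should report the sizes in those units directly.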

Here are some thoughts for possible causes:

  1. For some reason I've got 40 more objects in the restored version!
  2. The Mongo versions differ (2.4.5 vs 2.6.4). Could changes to the indexing algorithms be the cause?
  3. The initial database was in a replica set.
  4. The initial database used to be at 320GB, but a while back I went in and compressed all the images, which reduced it to 75GB. That's why storageSize on the initial database is substantially higher.

There is 1 answer

Best answer, by helmy

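MongoDB 2.6 uses Powers of Two Record Allocation (usePowerOf2Sizes) by default, which 2.4 did not. Each record is allocated a power-of-two size at or above what the document needs, and dataSize includes that padding, so the same documents can report a noticeably larger dataSize once they are restored into 2.6.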
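
To get a feel for how much padding this can add, here is a rough sketch of power-of-two rounding in plain JavaScript (Node.js). It is a simplified model, not mongod's actual allocator: the function names are made up, the 32-byte smallest bucket is an assumption, and record headers and any special handling of very large records are ignored.

```javascript
// Simplified model of power-of-two record allocation.
// Assumption: buckets start at 32 bytes and keep doubling; real mongod
// versions treat very large records differently, which is ignored here.
function powerOf2Size(bytes) {
  var size = 32;
  while (size < bytes) {
    size *= 2;
  }
  return size;
}

// Fraction of the allocated record that is padding rather than document data.
function paddingFraction(bytes) {
  var allocated = powerOf2Size(bytes);
  return (allocated - bytes) / allocated;
}

// A few document sizes, including one just over 256KB.
var samples = [1000, 37803, 262144, 262144 + 100];
samples.forEach(function (bytes) {
  console.log(bytes, "->", powerOf2Size(bytes), paddingFraction(bytes).toFixed(2));
});
```

A document that is already an exact power of two gets no padding in this model, while one that is only slightly larger is rounded up to almost twice its size. GridFS chunk documents that end up just over a power-of-two boundary are the worst case.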

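Before loading your data, you can try either starting mongod with the newCollectionsUsePowerOf2Sizes parameter set to false (for example via --setParameter newCollectionsUsePowerOf2Sizes=false), or running collMod on each collection. The collection has to exist before collMod can be applied to it: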

db.runCommand( { collMod: "myCollection", usePowerOf2Sizes: false })
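
Since the question is about GridFS, the collections to change would most likely be fs.files and fs.chunks. Those are the default GridFS bucket names and only an assumption about this database. A minimal sketch of applying the collMod above to several collections at once follows; the helper name `disablePowerOf2Sizes` is hypothetical, and it only relies on the object passed in having a runCommand method, as the shell's db does.

```javascript
// Run the collMod from the answer against each named collection.
// `db` is expected to behave like the mongo shell's db object;
// `names` is a list of collection names that already exist.
function disablePowerOf2Sizes(db, names) {
  return names.map(function (name) {
    var result = db.runCommand({ collMod: name, usePowerOf2Sizes: false });
    return { collection: name, result: result };
  });
}
```

In the shell, this would be called as disablePowerOf2Sizes(db, ["fs.files", "fs.chunks"]) on the target database, after creating the collections and before running the restore. Each returned result should have ok set to 1 if the command was accepted.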