I use MongoDB GridFS and have a fairly large database: db.stats() currently reports a dataSize of 89GB.
When I create a dump with mongodump, the dump directory takes up 86GB on the file system. When I restore the database on another machine and run db.stats() there,
dataSize is now 122GB.
Does anyone know the reason for this 33GB rise in dataSize after a dump/restore?
Edit: Here are the stats from the initial database:
MongoDB shell version: 2.4.5
connecting to: imgdb
rs0:PRIMARY> db.stats();
{
    "db" : "imgdb",
    "collections" : 4,
    "objects" : 2549884,
    "avgObjSize" : 37802.88397276111,
    "dataSize" : 96392968996,
    "storageSize" : 363433842080,
    "numExtents" : 207,
    "indexes" : 4,
    "indexSize" : 307245904,
    "fileSize" : 366974337024,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "ok" : 1
}
And here are the stats from the restored database:
MongoDB shell version: 2.6.4
connecting to: imgdb
> db.stats();
{
    "db" : "imgdb",
    "collections" : 4,
    "objects" : 2549924,
    "avgObjSize" : 51781.40103312883,
    "dataSize" : 132038637248,
    "storageSize" : 132281756768,
    "numExtents" : 98,
    "indexes" : 4,
    "indexSize" : 199976784,
    "fileSize" : 135159349248,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "extentFreeList" : {
        "num" : 0,
        "totalSize" : 0
    },
    "ok" : 1
}
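As a quick sanity check, the two dataSize values above can be converted from bytes to GiB to confirm the 89GB and 122GB figures and the roughly 33GB gap. This is a minimal sketch meant for Node.js; the variable and function names are illustrative, and the byte counts are copied from the two outputs.

```javascript
// dataSize values in bytes, copied from the two db.stats() outputs.
var originalDataSize = 96392968996;
var restoredDataSize = 132038637248;

var BYTES_PER_GIB = Math.pow(1024, 3);

// Convert a byte count to GiB.
function toGiB(bytes) {
  return bytes / BYTES_PER_GIB;
}

var originalGiB = toGiB(originalDataSize);
var restoredGiB = toGiB(restoredDataSize);
var growthGiB = restoredGiB - originalGiB;
var growthPercent = (restoredDataSize / originalDataSize - 1) * 100;

console.log("original: " + originalGiB.toFixed(2) + " GiB");
console.log("restored: " + restoredGiB.toFixed(2) + " GiB");
console.log("growth:   " + growthGiB.toFixed(2) + " GiB (" + growthPercent.toFixed(1) + "%)");
```

So the "GB" figures quoted in the question are binary gigabytes, and the restored dataSize is a bit over a third larger than the original.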
Here are some thoughts on possible causes:
- For some reason I've got 40 more objects in the restored version!
- The MongoDB versions differ (2.4.5 vs. 2.6.4). Could changes to the indexing algorithms be the cause?
- The initial database was part of a replica set.
- The initial database used to be 320GB, but a while back I compressed all the images and reduced it to 75GB. That is why storageSize on the initial database is substantially higher.
MongoDB 2.6 uses Powers of Two Record Allocation by default.
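To get a feel for why this can inflate dataSize, here is a simplified model of power-of-2 allocation, written for Node.js. It assumes that each record is rounded up to the next power of two with a 32-byte minimum, and that dataSize counts the allocated record size including padding. It ignores record headers and any special handling of very large documents, and the sample sizes are hypothetical, so treat it as an illustration rather than the server's exact algorithm.

```javascript
// Simplified model: round a document up to the next power-of-two record size.
// Assumption: the smallest record is 32 bytes; headers are ignored.
function powerOf2RecordSize(documentBytes) {
  var size = 32;
  while (size < documentBytes) {
    size *= 2;
  }
  return size;
}

// Hypothetical GridFS chunk document sizes in bytes, for illustration only.
var sampleSizes = [20000, 37803, 70000];
var allocated = sampleSizes.map(powerOf2RecordSize);

console.log(allocated.join(", ")); // 32768, 65536, 131072
```

A document just over a power-of-two boundary wastes the most space under this model, which would fit the higher avgObjSize reported after the restore.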
Before loading your data, you can try either changing the mongod parameter newCollectionsUsePowerOf2Sizes or running collMod on your collection.
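A minimal sketch of the two options for the mongo shell on 2.6 follows. It assumes the GridFS chunks live in the default fs.chunks collection, which may not match your bucket name, and the variable names are only illustrative. The `typeof db` guard is there so the snippet can also be loaded outside the shell without side effects.

```javascript
// Option 1: server-wide default for collections created from now on.
var disableDefault = { setParameter: 1, newCollectionsUsePowerOf2Sizes: false };

// Option 2: per-collection switch. "fs.chunks" is the default GridFS
// chunk collection name (an assumption about this deployment).
var disableForChunks = { collMod: "fs.chunks", usePowerOf2Sizes: false };

// `db` only exists inside the mongo shell.
if (typeof db !== "undefined") {
  // Pick one of the two options; option 1 is run here.
  printjson(db.adminCommand(disableDefault));

  // Option 2 needs the collection to exist already, so create it
  // empty before restoring and then run collMod on it:
  // db.createCollection(disableForChunks.collMod);
  // printjson(db.runCommand(disableForChunks));
}
```

Either setting only affects records allocated afterwards, so it has to be in place before the restore. It is worth running db.stats() again after restoring to see whether dataSize comes out closer to the original.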