I'm using BulkWriteOperation (Java driver) to store data in large chunks. At first it works fine, but as the collection grows, the inserts can take quite a lot of time.
Currently, for a collection of 20M documents, a bulk insert of 1000 documents can take about 10 seconds.
Is there a way to make insert time independent of collection size? I don't have any updates or upserts; I'm always inserting new data.
Judging from the log, there doesn't seem to be any issue with locks. Each document has an indexed time field, but its values grow linearly, so I don't see why MongoDB would need to spend time reorganizing the index.
I'd love to hear some ideas for improving the performance.
Thanks
Disk utilization & CPU: Check disk utilization and CPU usage and see whether either is maxing out. Most likely the disk is the bottleneck in your case.
Mongo log: If a bulk insert of 1000 documents takes 10 seconds, check the mongo log to see whether a few individual inserts within the batch account for most of that time. If so, you can narrow your analysis down to those operations.
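To make the log check concrete, here is a minimal sketch of a filter that pulls out the slow entries. It assumes the plain-text mongod log format in which a slow operation line ends with its duration, such as `... 1234ms`; newer servers that write JSON logs would need a different parser. The class name `SlowOpFilter`, the method `slowOps`, and the 1000 ms threshold are my own choices for illustration, not part of MongoDB or the Java driver.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of MongoDB or the Java driver.
class SlowOpFilter {
    // Assumption: a slow-operation line ends with its duration, e.g. "... 1234ms".
    private static final Pattern TRAILING_MS = Pattern.compile("(\\d+)ms\\s*$");

    /** Returns the lines whose trailing duration is at least thresholdMs. */
    static List<String> slowOps(List<String> lines, long thresholdMs) {
        List<String> slow = new ArrayList<>();
        for (String line : lines) {
            Matcher m = TRAILING_MS.matcher(line);
            if (m.find() && Long.parseLong(m.group(1)) >= thresholdMs) {
                slow.add(line);
            }
        }
        return slow;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java SlowOpFilter <path-to-mongod-log>");
            return;
        }
        // Print every logged operation that took one second or longer.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        slowOps(lines, 1000).forEach(System.out::println);
    }
}
```

If only a handful of lines come back for a 10-second batch, those specific inserts are worth a closer look; if nothing exceeds the threshold, the time is probably spread evenly across the batch, which points back to disk or CPU.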
Another thing that's not clear is the mix of operations hitting your Mongo instance. Are inserts the only operation, or are find queries running at the same time? If other queries are running too, you should look at scaling up whichever resource is maxing out.