I need to pass all the collections in my MongoDB database as input to a Hadoop MapReduce job. There is a method that allows multiple inputs:
    MultiCollectionSplitBuilder mcsb = new MultiCollectionSplitBuilder();
    mcsb.add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
             (MongoURI)null, // authuri
             true, // notimeout
             (DBObject)null, // fields
             (DBObject)null, // sort
             (DBObject)null, // query
             false, // range query
             MultiMongoCollectionSplitter.class)
        .add(new MongoURI("mongodb://localhost:27017/mongo_hadoop.yield_historical.in"),
             (MongoURI)null, // authuri
             true, // notimeout
             (DBObject)null, // fields
             (DBObject)null, // sort
             new BasicDBObject("_id", new BasicDBObject("$gt", new Date(883440000000L))), // query
             false, // range query
             MultiMongoCollectionSplitter.class);
But I have around 10 collections in my database, and the example above only passes 2 collections. All I need is to get all of the collections into the mapper; my reducer will be the same for all of them.
Any help is appreciated.
You can continue to add to the MultiCollectionSplitBuilder: keep chaining .add(...) calls, one per collection. It is not limited to two.
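With around 10 collections, writing each call by hand gets repetitive, so one option is to loop over the collection names. Below is a minimal sketch, not tested against mongo-hadoop itself. The class CollectionUris, its build helper, and the collection names "orders" and "users" are hypothetical. The builder call appears only as a comment because it needs mongo-hadoop on the classpath; its arguments are copied from the snippet in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CollectionUris {
    // Builds one "mongodb://host:port/db.collection" input URI per collection name.
    static List<String> build(String hostPort, String database, List<String> collections) {
        List<String> uris = new ArrayList<>();
        for (String name : collections) {
            uris.add("mongodb://" + hostPort + "/" + database + "." + name);
        }
        return uris;
    }

    public static void main(String[] args) {
        // Hypothetical names; they could also be read from the database at job
        // setup time, e.g. via the legacy driver's DB.getCollectionNames().
        List<String> names = Arrays.asList("yield_historical.in", "orders", "users");

        for (String uri : build("localhost:27017", "mongo_hadoop", names)) {
            // Assumed usage, mirroring the call in the question:
            // mcsb.add(new MongoURI(uri), (MongoURI) null, true,
            //          (DBObject) null, (DBObject) null, (DBObject) null,
            //          false, MultiMongoCollectionSplitter.class);
            System.out.println(uri);
        }
    }
}
```

Since every collection is added with the same arguments, the mapper sees documents from all of them and the reducer stays unchanged.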