I am trying a simple example in which the output of one MapReduce job should be the input of another MapReduce job.
The flow should be like this: Mapper1 --> Reducer1 --> Mapper2 --> Reducer2
(The output of Mapper1 must be the input of Reducer1. The output of Reducer1 must be the input of Mapper2. The output of Mapper2 must be the input of Reducer2. The output of Reducer2 must be stored in output file).
How can I add multiple Mappers and Reducers to my program such that the flow is maintained like above?
Do I need to use ChainMapper or ChainReducer? If so, how can I use them?
You need to implement two separate MapReduce jobs for that. The result of the first job needs to be written to persistent storage (typically HDFS) and is then read as the input of the second job. SequenceFileOutputFormat/SequenceFileInputFormat are often used for that intermediate data. Both MapReduce jobs can be executed from the same driver program.
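A minimal driver sketch is below, assuming your Mapper1/Reducer1/Mapper2/Reducer2 classes already exist and that Text/IntWritable key-value types and the three path arguments are placeholders you would adapt to your own job:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ChainedJobsDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);        // original input
        Path intermediate = new Path(args[1]); // output of job 1 / input of job 2
        Path output = new Path(args[2]);       // final output

        // Job 1: Mapper1 --> Reducer1, writes SequenceFiles to the intermediate path
        Job job1 = Job.getInstance(conf, "job1");
        job1.setJarByClass(ChainedJobsDriver.class);
        job1.setMapperClass(Mapper1.class);
        job1.setReducerClass(Reducer1.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        job1.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);

        // Stop if the first job fails
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        // Job 2: Mapper2 --> Reducer2, reads the SequenceFiles written by job 1
        Job job2 = Job.getInstance(conf, "job2");
        job2.setJarByClass(ChainedJobsDriver.class);
        job2.setMapperClass(Mapper2.class);
        job2.setReducerClass(Reducer2.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(IntWritable.class);
        job2.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);

        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```

Because the second job only starts after `job1.waitForCompletion(true)` returns successfully, the flow Mapper1 --> Reducer1 --> Mapper2 --> Reducer2 is preserved, with the intermediate directory acting as the hand-off point between the two jobs.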