I'm new to Hadoop and I'm currently learning MapReduce design patterns from the book MapReduce Design Patterns by Donald Miner & Adam Shook. The book describes a Cartesian Product pattern. My questions are:
- When does the record reader send data to the mapper?
- Where is the code that sends the data to the mapper?
What I see is that the next function in the CartesianRecordReader class reads both splits without sending the data anywhere.
Here is the source code: https://github.com/adamjshook/mapreducepatterns/blob/master/MRDP/src/main/java/mrdp/ch5/CartesianProduct.java
That's all, thanks in advance :)
Let me answer by giving you an idea of how the mapper and the RecordReader are related. The code that sends data to the mapper is part of Hadoop itself, not of the RecordReader.
Basically, Hadoop will call `next` until it returns `false`, and at every call `key` and `value` will obtain new values. After each successful call, Hadoop passes that `key` and `value` to the mapper. The `key` is normally the byte offset in the file and the `value` is the next line in the file.

That code is in the Hadoop source (probably in the MapContextImpl class), and it resembles the loop described above.

EDIT: The source code is in MapRunner.
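
To make the relationship concrete, here is a minimal sketch of that kind of driver loop. It is not the Hadoop source: `Holder`, `SimpleRecordReader`, `SimpleMapper`, `LineReader` and `MapRunnerSketch` are hypothetical stand-ins so the example runs without Hadoop on the classpath, and the offset is counted in characters rather than bytes for simplicity. The point is only the shape: the reader's `next` fills `key` and `value` and returns, and the surrounding loop is what hands them to the mapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the real Hadoop types.
public class MapRunnerSketch {

    /** Mutable holder, playing the role of a reusable key or value object. */
    static final class Holder<T> {
        T value;
    }

    /** Simplified analogue of a record reader: next() fills key and value. */
    interface SimpleRecordReader<K, V> {
        boolean next(Holder<K> key, Holder<V> value);
    }

    /** Simplified analogue of a mapper. */
    interface SimpleMapper<K, V> {
        void map(K key, V value, List<String> output);
    }

    /** Reads lines; key = character offset of the line start, value = the line. */
    static final class LineReader implements SimpleRecordReader<Long, String> {
        private final Iterator<String> lines;
        private long offset = 0;

        LineReader(List<String> lines) {
            this.lines = lines.iterator();
        }

        @Override
        public boolean next(Holder<Long> key, Holder<String> value) {
            if (!lines.hasNext()) {
                return false;             // no more records: the driver loop stops
            }
            String line = lines.next();
            key.value = offset;           // the reader only fills the holders
            value.value = line;
            offset += line.length() + 1;  // +1 for the newline
            return true;                  // it never calls the mapper itself
        }
    }

    /** The framework side: pull a record, then hand it to the mapper. */
    static <K, V> List<String> run(SimpleRecordReader<K, V> reader,
                                   SimpleMapper<K, V> mapper) {
        Holder<K> key = new Holder<>();
        Holder<V> value = new Holder<>();
        List<String> output = new ArrayList<>();
        while (reader.next(key, value)) {
            mapper.map(key.value, value.value, output);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> out = run(
            new LineReader(Arrays.asList("foo", "quux")),
            (k, v, o) -> o.add(k + ":" + v));
        System.out.println(out); // prints [0:foo, 4:quux]
    }
}
```

Seen this way, a reader like CartesianRecordReader only has to decide what the next key/value pair is (there, a pair of records from the two splits); it does not need any code that "sends" data, because the caller of `next` does that.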