I have a scenario where I am not sure which nodes the reduce tasks will run on.
i) I have an input text file containing thousands of integers, evenly balanced across the range 1 to 4.
ii) Let us suppose there is a 4-node cluster, each node with 12 slots, of which 4 are allocated as reducer slots - giving us 16 reduce slots in total.
iii) I have set the number of reducers in the driver:
jobConf.setNumReduceTasks(4);
iv) And finally, given I have a partitioner class like this:
public class MyPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        // keys are "1".."4", but valid partitions are 0..numPartitions-1
        return (Integer.parseInt(key.toString()) - 1) % numPartitions;
    }
}
1) i. How do I force the job to run one reducer on each node (leaving the other 3 local reducer slots idle), rather than running more than one reducer on the same node? I.e. how can I ensure the job doesn't use all 4 reduce slots on one node while the 12 reduce slots on nodes 2, 3 & 4 sit idle?
ii. Does Hadoop MR manage resources so as to say: "Node X is the most idle, I'll spawn a reducer there..."?
2) If you have skew on a key but still intend to group on it, can you spawn multiple reducers for that key? E.g. could I add a random offset to the partition chosen for the key "4", so that three additional reducers share its load and it is spread across the last four partitions:
jobConf.setNumReduceTasks(7);
and
public class MyPartitioner2 extends Partitioner<Text, Text> {
    private final Random random = new Random();

    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        int p = Integer.parseInt(key.toString()) - 1; // keys "1".."4" -> partitions 0..3
        if (p == 3) {
            return 3 + random.nextInt(4); // spread key "4" across partitions 3..6
        }
        return p;
    }
}
Would that work for skew?
This isn't something you can control - the assignment of map and reducer tasks to nodes is handled by the JobTracker.
There's an O'Reilly answer that describes task assignment in a good amount of detail:
http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/
The default behaviour is to assign one task per update (heartbeat) from each TaskTracker, so you shouldn't typically see all your reduce tasks satisfied by the same node - but if your cluster is busy with other jobs and only a single node has free reduce slots, then all of your reduce tasks may be assigned to that node.
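You can't steer this from the driver, but if you want to sanity-check how busy the cluster's reduce slots are before you submit, the old mapred API exposes the slot counts via JobClient/ClusterStatus. A rough sketch for inspection only - it doesn't influence placement, and the class name is made up:

import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SlotCheck {
    public static void main(String[] args) throws Exception {
        // Assumes the cluster configuration (mapred-site.xml etc.) is on the classpath
        JobClient client = new JobClient(new JobConf());
        ClusterStatus status = client.getClusterStatus();
        int freeReduceSlots = status.getMaxReduceTasks() - status.getReduceTasks();
        System.out.println("Task trackers:      " + status.getTaskTrackers());
        System.out.println("Reduce slots total: " + status.getMaxReduceTasks());
        System.out.println("Reduce slots free:  " + freeReduceSlots);
    }
}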
As for handling skew, this approach will reduce the chance that all the data for a single known high-volume key ends up on one node (again, there is no guarantee of this), but you'll still have the problem of combining the outputs of the several reducers that handled that skewed key into the final answer.
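To make that concrete, here's a minimal sketch of that combining step as a small follow-up job, assuming the first job's reducers write plain key<TAB>count text lines and that you're just summing the partial counts per key (the class names are made up):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MergeSkewedKeyJob {

    // Parses "key<TAB>partialCount" lines from the first job's output and re-emits them.
    // The salt only affected the partition number, not the key itself, so no key rewriting is needed.
    public static class ParseMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text outKey = new Text();
        private final IntWritable outVal = new IntWritable();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] parts = line.toString().split("\t");
            outKey.set(parts[0]);
            outVal.set(Integer.parseInt(parts[1].trim()));
            context.write(outKey, outVal);
        }
    }

    // Sums the partial counts for each key, collapsing the several partial
    // results for the skewed key "4" into one final total.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "merge skewed key partials");
        job.setJarByClass(MergeSkewedKeyJob.class);
        job.setMapperClass(ParseMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setNumReduceTasks(1); // the partials are tiny by now, one reducer is enough
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // output dir of the first job
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Because the partial outputs are small by this point, a single reducer (or even a local merge of the part files) is enough to produce the final answer.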