How do we allocate different number of reducers of heterogeneous clusters?

58 views Asked by At

Our system has a cluster of 5 hosts (e.g., data node or computer slaves…). Now, I want allocate different number of reducers of these hosts because 1 host is slow. . We are using Hadoop Yarn. The resource manager (so called Job tracker in MapReduce1) always allocate evenly number of reducers of to 5 hosts. Is there anyway that I can limit number of reducers of a specific host? For example, what I want is that a job with 40 reducers, 4 fast computers have 36 reducers (e.g., 9 reducers each host), the slow computer has only 4 reducers.

1

There are 1 answers

0
Vijay Bhoomireddy On

It is entirely possible and a common phenomenon to have heterogenous systems in a hadoop cluster. Typically, as the cluster keeps becoming larger and hence is scaling horizontally, new nodes of different configurations get added to the cluster.

In such scenarios, in order to have configurations applicable to a specific node or to a group of nodes, we need to modify the configurations accordingly on those hosts.

For example, in case of Hortonworks Data Platform where the cluster is managed through Ambari, the concept of host config groups can be leveraged for this purpose.

Please see the below link for further information:

https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_Ambari_Users_Guide/content/_using_host_config_groups.html

Also see the below link, where the discussion is about increasing the number of YARN containers at a node level. It remains the same in your case as well, which is the opposite of the use case discussed there:

How to increase the number of containers in nodemanager in YARN

Another useful link:

http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/