I am running some MapReduce jobs on an AWS EMR cluster with ~10 nodes (EMR 4.7.11, m3.xlarge).
While the job is running, the worker nodes start to die one by one after ~4 hours. In the logs I found the following error:
"1/3 local-dirs are bad: /mnt/yarn; 1/1 log-dirs are bad: /var/log/hadoop-yarn/containers"
The disks on the worker nodes were at 96% used when the Nodes failed. So I assume the disks on the nodes got to 100% and no files could be written to the disk.
So I tried to attach a 500 GB EBS volume to each instance, but Hadoop only uses /mnt and does not use the additional volume (/mnt2).
How do I configure the AWS EMR cluster to use /mnt2?
I've tried to use a configuration file, but the cluster now fails on bootstrap with the error: On the master instance (i-id), bootstrap action 6 returned a non-zero
Unfortunately, there is no bootstrap action 6 log in the S3 bucket.
The config file:
    [
      {
        "Classification": "core-site",
        "Properties": {
          "hadoop.tmp.dir": "/mnt2/var/lib/hadoop/tmp"
        }
      },
      {
        "Classification": "mapred-site",
        "Properties": {
          "mapred.local.dir": "/mnt2/var/lib/hadoop/mapred"
        }
      }
    ]
Does anyone have a hint why the cluster fails on startup? Or is there another way to increase the initial EBS volume of the m3.xlarge instances?
https://forums.aws.amazon.com/thread.jspa?threadID=225588 looks like the same issue, but there is no solution.
If a disk (such as /mnt/) goes beyond 90% utilization, the core/task node is marked unhealthy and unusable by YARN. See
yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
in http://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

Now, if you attach EBS volumes with the EMR API (while you provision your cluster), EMR does use those volumes for certain properties automatically. For example, mapred.local.dir will use all mounts. However, some properties (such as hadoop.tmp.dir and yarn.nodemanager.log-dirs) may not use all mounts. For such properties, you will need to add a comma-separated list of directory paths as the value, either through the configurations API or by manually editing the necessary files.
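As a rough sketch of what a comma-separated value could look like for one of those properties, the configuration below lists two log directories for yarn.nodemanager.log-dirs. The first path is the one from the error message in the question. The second path under /mnt2 is an assumption for illustration only: it would have to exist and be writable by the YARN user on every node, and this has not been tested on a real cluster.

```json
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.log-dirs": "/var/log/hadoop-yarn/containers,/mnt2/var/log/hadoop-yarn/containers"
    }
  }
]
```

The same pattern (one directory per mount, joined with commas) would apply to the other properties that do not pick up the extra mounts on their own. Each disk is still checked separately against the 90% threshold, so spreading directories across mounts only helps if the data actually lands on the new volume.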