I have written a JMH benchmark for my MapReduce job. If I run my app in local mode, it works, but when I run it with the yarn script on my hadoop cluster, then I get the following error:
[cloudera@quickstart Desktop]$ ./launch_mapreduce.sh
# JMH 1.10 (released 5 days ago)
# VM invoker: /usr/java/jdk1.7.0_67-cloudera/jre/bin/java
# VM options: -Dproc_jar -Xmx1000m -Xms825955249 -Xmx825955249 -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/lib/hadoop-yarn/logs -Dyarn.log.dir=/usr/lib/hadoop-yarn/logs -Dhadoop.log.file=yarn.log -Dyarn.log.file=yarn.log -Dyarn.home.dir=/usr/lib/hadoop-yarn -Dhadoop.home.dir=/usr/lib/hadoop-yarn -Dhadoop.root.logger=INFO,console -Dyarn.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: mgm.tp.bigdata.ma_mapreduce.MapReduceBenchmark.test
# Run progress: 0.00% complete, ETA 00:00:10
# Fork: 1 of 1
Error: Could not find or load main class org.openjdk.jmh.runner.ForkedMain
<forked VM failed with exit code 1>
<stdout last='20 lines'>
</stdout>
<stderr last='20 lines'>
Error: Could not find or load main class org.openjdk.jmh.runner.ForkedMain
</stderr>
# Run complete. Total time: 00:00:00
Benchmark Mode Cnt Score Error Units
my shell script is the following:
/usr/bin/yarn jar ma-mapreduce-benchmark.jar
and my benchmark options are:
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder()
.include(MapReduceBenchmark.class.getSimpleName())
.warmupIterations(5)
.measurementIterations(5)
.forks(1)
.build();
new Runner(opt).run();
}
In my opinion jhm not wrok on hadoop cluster, because in each node of the cluster the benchmark want start a own jvm. That not work, the node communicate for the parallelization. First I measure the time for the execution of the program and repeat this, at the end I calculate the fault tolerance.