I have the following Spark word count program:
package com.sample.spark;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFlatMapFunction;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import scala.Tuple2;
public class SparkWordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("wordcountspark")
                .setMaster("local")
                .setSparkHome("/Users/hadoop/spark-1.4.0-bin-hadoop1");
        JavaSparkContext sc = new JavaSparkContext(conf);
        //SparkConf conf = new SparkConf();
        //JavaSparkContext sc = new JavaSparkContext("hdfs", "Simple App", "/Users/hadoop/spark-1.4.0-bin-hadoop1", new String[]{"target/simple-project-1.0.jar"});

        JavaRDD<String> textFile = sc.textFile("hdfs://localhost:54310/data/wordcount");

        JavaRDD<String> words = textFile.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String s) {
                return Arrays.asList(s.split(" "));
            }
        });

        JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
            public Tuple2<String, Integer> call(String s) {
                return new Tuple2<String, Integer>(s, 1);
            }
        });

        JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
            public Integer call(Integer a, Integer b) {
                return a + b;
            }
        });

        counts.saveAsTextFile("hdfs://localhost:54310/data/output/spark/outfile");
    }
}
I get a Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.CanSetDropBehind exception when I run the code from Eclipse. However, if I export it as a runnable jar and run it from the terminal as below, it works:
bin/spark-submit --class com.sample.spark.SparkWordCount --master local /Users/hadoop/spark-1.4.0-bin-hadoop1/finalJars/SparkJar-v2.jar
The Maven POM looks like:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.sample.spark</groupId>
    <artifactId>SparkRags</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>SparkRags</name>
    <url>http://maven.apache.org</url>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.4.0</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>0.23.11</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.2.1</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
</project>
When you run from Eclipse, the jars referenced on the project's build path are the only source your program has to run against. So the Hadoop jar that provides CanSetDropBehind (the hadoop-core / hadoop-common artifacts in your POM) is, for some reason, not being added properly to your Eclipse classpath from the local repository. You need to work out whether this is a proxy issue or some other problem with the POM and its dependency resolution.
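One way to check what is actually being resolved (just a suggestion from my side, assuming Maven is available on your command line) is to print the dependency tree for the Hadoop artifacts and compare it with what Eclipse shows under Maven Dependencies:

mvn dependency:tree -Dincludes=org.apache.hadoop

If hadoop-core / hadoop-common do not show up there with the versions declared in your POM, the problem is in the resolution (proxy, broken local repository, etc.) rather than in your code.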
When you run the jar from the terminal, the likely reason it works is that the required jar is already present on the classpath that spark-submit sets up from your Spark installation. Another possibility is building a fat jar (one that bundles hadoop-core and the other dependencies) when you export your jar; in that case the class would be picked up from inside your own jar, without depending on the external classpath at all. From your description it sounds like you are not using that option, so the terminal run is relying on the classpath.
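If you do want the jar you run from the terminal to be self-contained instead, a minimal sketch of a fat-jar setup with the maven-shade-plugin would look roughly like this (the plugin version is only an example, adjust it to whatever you have available):

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.4.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

With that in place, mvn package produces a shaded jar that bundles the compile-scope dependencies, so a class like CanSetDropBehind would be looked up inside your own jar rather than on the external classpath.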
Verify each of these steps and it should help you identify the cause. Happy coding!