How to run Spark locally on Windows using Eclipse in Java


I'm trying to test MLlib's implementation of SVM. I want to run their Java example locally on Windows, using Eclipse. I've downloaded Spark 1.3.1 pre-built for Hadoop 2.6. When I try to run the example code, I get:

15/06/11 16:17:09 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

What should I change in order to be able to run the example code in this setup?


There are 3 answers

vishal rathod
  1. Create the directory E:\hadoop_home\bin

  2. Download the winutils.exe file from the hadoop-x.x.x/bin directory matching your Hadoop version in the following GitHub repo: https://github.com/steveloughran/winutils

  3. Place the downloaded winutils.exe file into the E:\hadoop_home\bin directory created in step 1.

  4. Set the hadoop.home.dir system property in your code, for example:

object QuesCount {

  def main(args: Array[String]): Unit = {
    // Point Hadoop at the directory that contains bin\winutils.exe.
    // This must run before any Spark or Hadoop code executes.
    System.setProperty("hadoop.home.dir", "E:\\hadoop_home")

    // ... create your SparkContext and run your Spark code here ...
  }
}
  5. Right-click your Scala file and choose Run As > Scala Application.
snesneros
  1. Create the following directory structure: "C:\hadoop_home\bin" (or replace "C:\hadoop_home" with whatever you like)

  2. Download the following file: http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe

  3. Put the file from step 2 into the "bin" directory from step 1.

  4. Set the "hadoop.home.dir" system property to "C:\hadoop_home" (or whatever directory you created in step 1, without the "\bin" at the end), as shown in the sketch below.
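
Put together, a minimal Java sketch of these steps might look like the following (the class name, app name, and local[*] master are illustrative assumptions, not part of the original answer):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class WinutilsExample {
    public static void main(String[] args) {
        // Must be set before any Spark/Hadoop class is used, so that
        // Hadoop can find C:\hadoop_home\bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop_home");

        SparkConf conf = new SparkConf()
                .setAppName("WinutilsExample") // hypothetical app name
                .setMaster("local[*]");        // run Spark locally inside Eclipse

        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Spark started, version " + sc.version());
        sc.stop();
    }
}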

AudioBubble

To run Spark in Windows Eclipse with a Maven project:

  1. Create a Scala project and declare a Scala object. Then convert the project to Maven (you need the m2eclipse plugin installed for this; you can find it in the Eclipse Marketplace).
  2. A pom.xml will be created. Add the dependency below:

<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.6.0</version>
</dependency>
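
Since the original question is about testing MLlib's SVM, spark-core alone will likely not be enough; you would also need the matching MLlib artifact (this addition is an assumption, the answer itself lists only spark-core):

<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.6.0</version>
</dependency>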

  3. Now build your project with some sample Spark code (there should not be any errors).
  4. Now follow the setup below:

    • Create the following directory structure: "C:\hadoop_home\bin" (or replace "C:\hadoop_home" with whatever you like)
    • Download the following file: http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe
    • Put the downloaded winutils.exe into the "bin" directory you just created.
    • Set the "hadoop.home.dir" system property to "C:\hadoop_home" (or whatever directory you created, without the "\bin" at the end). Note: declare this property at the very beginning of your Spark code, as in the line below.

System.setProperty("hadoop.home.dir", "C:\\hadoop_home") // must match the directory from the setup above, without "\bin"
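
With the dependency and winutils in place, a minimal MLlib SVM smoke test in Java could look like the sketch below (the class name and the toy training data are illustrative; this assumes the spark-mllib artifact suggested earlier is on the classpath):

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

public class SvmLocalTest {
    public static void main(String[] args) {
        // Step from the answer above: must come before any Spark code.
        System.setProperty("hadoop.home.dir", "C:\\hadoop_home");

        SparkConf conf = new SparkConf().setAppName("SvmLocalTest").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // A tiny hand-made training set: label 1.0 vs 0.0 on one feature.
        JavaRDD<LabeledPoint> training = sc.parallelize(Arrays.asList(
                new LabeledPoint(1.0, Vectors.dense(2.0)),
                new LabeledPoint(1.0, Vectors.dense(3.0)),
                new LabeledPoint(0.0, Vectors.dense(-2.0)),
                new LabeledPoint(0.0, Vectors.dense(-3.0))));

        // Train a linear SVM for 100 iterations and try one prediction.
        SVMModel model = SVMWithSGD.train(training.rdd(), 100);
        System.out.println("Prediction for 4.0: " + model.predict(Vectors.dense(4.0)));

        sc.stop();
    }
}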