Building Apache Spark using SBT: Invalid or corrupt jarfile


I'm trying to install Spark on my local machine. I have been following this guide. I have installed JDK-7 (also have JDK-8) and Scala 2.11.7. A problem occurs when I try to use sbt to build Spark 1.4.1. I get the following exception.

NOTE: The sbt/sbt script has been relocated to build/sbt.
      Please update references to point to the new location.

      Invoking 'build/sbt assembly' now ...

Attempting to fetch sbt
Launching sbt from build/sbt-launch-0.13.7.jar
Error: Invalid or corrupt jarfile build/sbt-launch-0.13.7.jar

I have searched for a solution to this problem. I have found a nice guide https://stackoverflow.com/a/31597283/2771315 which uses a pre-built version. Other than using the pre-built version, is there a way to install Spark using sbt? Further, is there a reason as to why the Invalid or corrupt jarfile error occurs?

There are 2 answers

yuxia On BEST ANSWER

I met the same problem. I have fixed it now.

This probably happens because the download of sbt-launch-0.13.7.jar failed: the file exists, but its content is wrong. The real jar is about 1.2 MB. If yours is smaller than that, go into the build/ directory and open the file with "vim sbt-launch-0.13.7.jar" or another tool.

If the file contains something like this:

<html>
<head><title>404 Not Found</title></head>
<body bgcolor="white">
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

It implies that sbt-launch-0.13.7.jar was not downloaded. Then open sbt-launch-lib.bash in the same directory and check lines 41 and 42, which define two URLs. Open each one to check whether it works.
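The manual inspection above can also be scripted. This is only a minimal sketch: a real jar is a zip archive, so it starts with the bytes "PK", while a saved 404 page starts with "<html>". The function name looks_like_jar is made up for illustration, and the default path is the one from the error message in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark's build scripts):
# succeeds if the file exists and starts with the zip magic bytes "PK".
looks_like_jar() {
  [ -f "$1" ] && [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Default to the jar named in the error message; pass another path as $1.
JAR="${1:-build/sbt-launch-0.13.7.jar}"

if looks_like_jar "$JAR"; then
  echo "$JAR starts with a zip header"
else
  echo "$JAR is missing or is not a jar"
fi
```

This only checks the header, so a truncated download would still pass. Comparing the size against the expected 1.2 MB is a reasonable second check.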

If url1 doesn't work, download sbt-launch.jar manually (url2 may work, or you can download it from the official sbt website), put it in the same directory, and rename it to sbt-launch-0.13.7.jar. Then comment out the lines that do the downloading (probably between lines 47 and 68) so the script doesn't download it again. Like this:

acquire_sbt_jar () {
  SBT_VERSION=`awk -F "=" '/sbt\.version/ {print $2}'    ./project/build.properties`
  URL1=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
  URL2=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/${SBT_VERSION}/sbt-launch.jar
  JAR=build/sbt-launch-${SBT_VERSION}.jar

  sbt_jar=$JAR

 # if [[ ! -f "$sbt_jar" ]]; then
 #   # Download sbt launch jar if it hasn't been downloaded yet
 #   if [ ! -f "${JAR}" ]; then
 #   # Download
 #   printf "Attempting to fetch sbt\n"
 #   JAR_DL="${JAR}.part"
 #   if [ $(command -v curl) ]; then
 #     (curl --silent ${URL1} > "${JAR_DL}" || curl --silent ${URL2} > "${JAR_DL}") && mv "${JAR_DL}" "${JAR}"
 #   elif [ $(command -v wget) ]; then
 #     (wget --quiet ${URL1} -O "${JAR_DL}" || wget --quiet ${URL2} -O "${JAR_DL}") && mv "${JAR_DL}" "${JAR}"
 #   else
 #     printf "You do not have curl or wget installed, please install sbt manually from http://www.scala-sbt.org/\n"
 #     exit -1
 #   fi
 #   fi
 #   if [ ! -f "${JAR}" ]; then
 #   # We failed to download
 #   printf "Our attempt to download sbt locally to ${JAR} failed. Please install sbt manually from http://www.scala-sbt.org/\n"
 #   exit -1
 #   fi
 #   printf "Launching sbt from ${JAR}\n"
 # fi
 }
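As for why a 404 page ends up saved as the jar: curl without --fail exits with status 0 even when the server answers 404, so it writes the error page to the file and the "||" fallback to URL2 never runs. For the manual download, a sketch like the following avoids that. The function name fetch_jar is made up for illustration, and it is not a patch to Spark's script.

```shell
#!/bin/sh
# Hypothetical helper: download URL to DEST, keeping DEST only on success.
# --fail makes curl exit non-zero on HTTP errors such as 404, so an
# error page is never moved into place as the jar.
fetch_jar() {
  url=$1
  dest=$2
  if curl --fail --silent --show-error --location "$url" -o "${dest}.part"; then
    mv "${dest}.part" "$dest"
  else
    rm -f "${dest}.part"
    return 1
  fi
}
```

It could be called as fetch_jar "$URL1" build/sbt-launch-0.13.7.jar || fetch_jar "$URL2" build/sbt-launch-0.13.7.jar, with URL1 and URL2 set to whichever addresses you have confirmed still work.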

Then run "build/sbt assembly" to build Spark again.

Hope you will succeed.

If my explanation wasn't clear, the following links may be helpful.

https://www.mail-archive.com/[email protected]/msg34358.html

The answer by prabeesh to "Error: Invalid or corrupt jarfile sbt/sbt-launch-0.13.5.jar"

https://groups.google.com/forum/#!topic/predictionio-user/fllCh8n-0d4

Sandeep Mishra On

Download the sbt-launch.jar file manually (you can use url2, which may work, or download it from the official sbt website), put it in the build/ directory, rename it to sbt-launch-0.13.7.jar, then run the sbt/sbt assembly command.