Scala - Error java.lang.NoClassDefFoundError: upickle/core/Types$Writer


I'm new to Scala/Spark, so please be easy on me :)

I'm trying to run an EMR cluster on AWS, submitting the jar file I packaged with sbt package. When I run the code locally it works perfectly fine, but when I run it on the AWS EMR cluster, I get an error:

ERROR Client: Application diagnostics message: User class threw exception: java.lang.NoClassDefFoundError: upickle/core/Types$Writer

From what I understand, this error stems from a mismatch between the Scala/Spark versions and my dependencies.

I'm using Scala 2.12 with Spark 3.0.1, and on AWS I'm using emr-6.2.0.

Here's my build.sbt:

scalaVersion := "2.12.14"
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.792"
libraryDependencies += "com.amazonaws" % "aws-java-sdk-core" % "1.11.792"
libraryDependencies += "org.apache.hadoop" % "hadoop-aws" % "3.3.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "3.3.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "3.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.1"
libraryDependencies += "com.lihaoyi" %% "upickle" % "1.4.1"
libraryDependencies += "com.lihaoyi" %% "ujson" % "1.4.1"

What am I missing?

Thanks!


1 Answer

Answer by Alex Ott:

If you use sbt package, the generated jar will contain only your project's own classes, not its dependencies. You need to use sbt assembly to generate a so-called uberjar (fat jar) that includes the dependencies as well.
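sbt assembly comes from the sbt-assembly plugin, which is not bundled with sbt itself. A minimal sketch of enabling it in project/plugins.sbt (the version number is only an example; check for the current release):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.1.0")

After that, running sbt assembly produces the uberjar under target/scala-2.12/, which is the jar you would pass to spark-submit on EMR.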

But in your case, it's recommended to mark the Spark and Hadoop (and maybe AWS) dependencies as Provided, since they are already included in the EMR runtime. Use something like this:

libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.1" % Provided
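Applied to the rest of the dependencies from the question, a sketch could look like the following (the versions are the ones from your build.sbt; the assemblyMergeStrategy block is a common pattern for resolving duplicate META-INF entries when building the uberjar, adjust it to your project):

libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.1" % Provided
libraryDependencies += "org.apache.hadoop" % "hadoop-aws" % "3.3.0" % Provided
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "3.3.0" % Provided
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "3.3.0" % Provided
// upickle/ujson stay as regular (compile) dependencies so they end up in the uberjar
libraryDependencies += "com.lihaoyi" %% "upickle" % "1.4.1"
libraryDependencies += "com.lihaoyi" %% "ujson" % "1.4.1"

// typical merge strategy for conflicting files when assembling
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}

With Provided, those jars are still on the classpath at compile time but are left out of the assembled jar, which keeps it small and avoids clashing with the versions EMR ships; the upickle classes, which caused the NoClassDefFoundError, are bundled in.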