Java: 1.8, sbt: 1.9, Scala: 2.12
I have a very simple repo with the following dependency in build.sbt:
libraryDependencies ++= Seq("org.apache.spark" %% "spark-connect-client-jvm" % "3.5.0")
and a simple application:
import org.apache.spark.sql.SparkSession

object Main extends App {
  val s = SparkSession.builder().remote("sc://localhost").getOrCreate()
  s.read.json("/tmp/input.json").repartition(10).show(false)
}
But when I run it, I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/sparkproject/connect/client/com/google/common/cache/CacheLoader
at Main$.delayedEndpoint$Main$1(Main.scala:4)
at Main$delayedInit$body.apply(Main.scala:3)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1$adapted(App.scala:80)
at scala.collection.immutable.List.foreach(List.scala:431)
at scala.App.main(App.scala:80)
at scala.App.main$(App.scala:78)
at Main$.main(Main.scala:3)
at Main.main(Main.scala)
Caused by: java.lang.ClassNotFoundException: org.sparkproject.connect.client.com.google.common.cache.CacheLoader
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 11 more
I know that Spark Connect does a lot of shading during assembly, so the error could be related to that. The application is not started via spark-submit, nor is it run under a SPARK_HOME (I guess that is the whole point of the Connect client).
I followed the doc exactly as described. Can somebody help?
This is definitely a shading issue, probably introduced in the recent dependency rework. My apologies for the poor experience. I have filed https://issues.apache.org/jira/browse/SPARK-45371 to track this on our end and will keep you posted.
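In the meantime, one way to confirm that the missing class is a packaging problem and not something in your own build is to probe the classpath directly. The sketch below is only an illustration: `ShadedClassCheck` and `isOnClasspath` are hypothetical names, and the class name is copied from the stack trace in the question. It relies on nothing beyond `Class.forName` from the JDK.

```scala
object ShadedClassCheck {
  // Relocated Guava class named in the ClassNotFoundException from the question.
  val RelocatedGuavaClass: String =
    "org.sparkproject.connect.client.com.google.common.cache.CacheLoader"

  // Returns true if `className` can be loaded from the current classpath.
  // The class is loaded without being initialized, so no static code runs.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    val found = isOnClasspath(RelocatedGuavaClass)
    println(s"$RelocatedGuavaClass on classpath: $found")
  }
}
```

Placed in the same sbt project, it can be run with `sbt "runMain ShadedClassCheck"`. Given the error reported above, it would presumably print `false` with the 3.5.0 client, which points at the published artifact and not at the application code.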