I'm trying to add the Delta Lake support to Zeppelin.
So far I've tried adding the io.delta:delta-core_2.12:0.7.0
dependency to the spark interpreter, as well as a couple other related actions within the interpreters view... but nothing has worked.
When I add the io.delta:delta-core_2.12:0.7.0
dependency, I get errors within my notebooks such as:
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:668)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:577)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.util.Utils$.stringToSeq(Utils.scala:2664)
at org.apache.spark.internal.config.ConfigHelpers$.stringToSeq(ConfigBuilder.scala:49)
at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefault(ConfigBuilder.scala:143)
at org.apache.spark.internal.config.package$.<init>(package.scala:172)
at org.apache.spark.internal.config.package$.<clinit>(package.scala)
at org.apache.spark.SparkConf$.<init>(SparkConf.scala:716)
at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
at org.apache.spark.SparkConf.set(SparkConf.scala:95)
at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:77)
at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:76)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:71)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:58)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:80)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
... 8 more
My goal is to read/write from/to Delta Lake tables using Scala + Spark.
Thanks!
The most probable reason for this is that you're using Delta Lake with Spark 2.x - the package that you're using is supposed to work with Spark 3.0+ (compiled with Scala 2.12). The latest version of Delta that supports 2.4 (minimum 2.4.2) is 0.6.1 (see this answer).
So you need to upgrade Spark version if you want to use this specific package, or use another version of Delta if you want to keep you Spark installations.