How to add Delta Lake support to Zeppelin's spark interpreter?

985 views Asked by At

I'm trying to add the Delta Lake support to Zeppelin.

So far I've tried adding the io.delta:delta-core_2.12:0.7.0 dependency to the spark interpreter, as well as a couple other related actions within the interpreters view... but nothing has worked.

When I add the io.delta:delta-core_2.12:0.7.0 dependency, I get errors within my notebooks such as:

org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:668)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:577)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
    at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
    at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at org.apache.spark.util.Utils$.stringToSeq(Utils.scala:2664)
    at org.apache.spark.internal.config.ConfigHelpers$.stringToSeq(ConfigBuilder.scala:49)
    at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
    at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
    at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefault(ConfigBuilder.scala:143)
    at org.apache.spark.internal.config.package$.<init>(package.scala:172)
    at org.apache.spark.internal.config.package$.<clinit>(package.scala)
    at org.apache.spark.SparkConf$.<init>(SparkConf.scala:716)
    at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
    at org.apache.spark.SparkConf.set(SparkConf.scala:95)
    at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:77)
    at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:76)
    at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
    at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:71)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:58)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:80)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
    ... 8 more

My goal is to read/write from/to Delta Lake tables using Scala + Spark.

Thanks!

1

There are 1 answers

0
Alex Ott On BEST ANSWER

The most probable reason for this is that you're using Delta Lake with Spark 2.x - the package that you're using is supposed to work with Spark 3.0+ (compiled with Scala 2.12). The latest version of Delta that supports 2.4 (minimum 2.4.2) is 0.6.1 (see this answer).

So you need to upgrade Spark version if you want to use this specific package, or use another version of Delta if you want to keep you Spark installations.