Error running spark app using spark-cassandra connector


I have written a basic Spark app that reads and writes to Cassandra, following this guide: https://github.com/datastax/spark-cassandra-connector/blob/master/doc/0_quick_start.md
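The app itself is essentially the quick-start example. Roughly, the relevant part looks like this (just a sketch of what I'm doing; the keyspace, table, host, and class names are the placeholders from the guide, not my real setup):

    import org.apache.spark.{SparkConf, SparkContext}
    import com.datastax.spark.connector._

    object TestApp {
      def main(args: Array[String]): Unit = {
        // point the connector at the Cassandra cluster
        val conf = new SparkConf(true)
          .setAppName("test Project")
          .set("spark.cassandra.connection.host", "127.0.0.1")
        val sc = new SparkContext(conf)

        // read an existing table into an RDD of CassandraRow and touch it
        val rdd = sc.cassandraTable("test", "kv")
        println(rdd.count)

        // write a small collection back to the same table
        val data = sc.parallelize(Seq(("key3", 3), ("key4", 4)))
        data.saveToCassandra("test", "kv", SomeColumns("key", "value"))

        sc.stop()
      }
    }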

This is what the .sbt file for this app looks like:

name := "test Project"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.2.1",
      "com.google.guava" % "guava" % "14.0.1",
      "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.1",
      "org.apache.cassandra" % "cassandra-thrift" % "2.0.14",
      "org.apache.cassandra" % "cassandra-clientutil" % "2.0.14",
      "com.datastax.cassandra" % "cassandra-driver-core"  % "2.0.14"
)

As you can see, the Spark version is 1.2.1 (not 1.3.1, as in a lot of other questions), but when I run this app using spark-submit I still run into this error:

WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, abcdev26): java.lang.NoSuchMethodError: org.apache.spark.executor.TaskMetrics.inputMetrics_$eq(Lscala/Option;)V
        at com.datastax.spark.connector.metrics.InputMetricsUpdater$.apply(InputMetricsUpdater.scala:61)
        at com.datastax.spark.connector.rdd.CassandraTableScanRDD.compute(CassandraTableScanRDD.scala:196)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

What am I missing? All the answers I've found so far suggest using 1.2.1, which I'm already doing.

Any suggestions would be much appreciated!

1 Answer

maasg (accepted answer):

Are you 100% sure that you are running against Spark 1.2.1? Also on the executors?

The problem is that this metrics accessor became private in Spark 1.3.0, so it cannot be found at runtime. Compare TaskMetrics.scala in Spark 1.2.2 with TaskMetrics.scala in Spark 1.3.0: most probably there's a Spark 1.3.x version somewhere on the classpath.
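One quick way to check (a sketch you could drop into the job; nothing connector-specific, just plain Spark plus Java reflection) is to print the driver's version and the jar the executors actually load TaskMetrics from:

    // driver side: which Spark the job was launched with
    println("driver Spark version: " + sc.version)

    // executor side: the jar the tasks load TaskMetrics from; the file name
    // usually contains the version, e.g. spark-assembly-1.3.0-hadoop2.x.jar
    sc.parallelize(1 to 4).map { _ =>
      classOf[org.apache.spark.executor.TaskMetrics]
        .getProtectionDomain.getCodeSource.getLocation.toString
    }.distinct.collect().foreach(println)

If the executor side reports anything other than 1.2.x, that's your culprit.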

Make sure the same 1.2.x version is on all executors as well.
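Unrelated to the error itself, but worth double-checking while you're at it (this is an assumption about your build, not something the stack trace shows): if you assemble a fat jar, spark-core is usually marked "provided" so the only Spark classes at runtime are the cluster's own, which makes this kind of mismatch much easier to spot:

    // assumption: the jar is assembled and launched via spark-submit,
    // so Spark itself comes from the cluster, not from the fat jar
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.2.1" % "provided",
      "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.1"
    )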