Unable to start Spark application with Bahir


I am trying to run a Spark application in Scala that connects to ActiveMQ. I am using Bahir for this, via format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider"). With Bahir 2.2 in my build.sbt the application runs fine, but when I change it to Bahir 2.3.0 or Bahir 2.4.0 the application fails to start with this error:

[error] (run-main-0) java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream

How do I fix this? Alternatively, is there a replacement for Bahir that I can use with Spark Structured Streaming to connect to ActiveMQ topics?
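
For reference, a minimal sketch of how the Bahir MQTT source is typically wired up, assuming a local ActiveMQ broker on the default MQTT port (1883) and a placeholder topic name; the broker URL, topic, and console sink are illustrative only, and the source's output columns vary between Bahir releases:

import org.apache.spark.sql.SparkSession

object MqttSourceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("activemq-mqtt-sketch")
      .master("local[*]")
      .getOrCreate()

    // ActiveMQ serves MQTT on tcp://<host>:1883 by default; URL and topic are placeholders.
    val df = spark.readStream
      .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
      .option("topic", "sensors/temperature")
      .option("persistence", "memory") // keep received messages in memory rather than on disk
      .load("tcp://localhost:1883")

    // Print incoming rows to the console; the exact columns
    // (value/payload, timestamp, ...) depend on the Bahir version.
    val query = df.writeStream
      .format("console")
      .option("truncate", "false")
      .start()

    query.awaitTermination()
  }
}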

EDIT: my build.sbt

//For spark
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "2.4.0" ,
    "org.apache.spark" %% "spark-mllib" % "2.4.0" ,
    "org.apache.spark" %% "spark-sql" % "2.4.0" ,
    "org.apache.spark" %% "spark-hive" % "2.4.0" ,
    "org.apache.spark" %% "spark-streaming" % "2.4.0" ,
    "org.apache.spark" %% "spark-graphx" % "2.4.0",
)

//Bahir
libraryDependencies += "org.apache.bahir" %% "spark-sql-streaming-mqtt" % "2.4.0"

1 Answer

Answered by WinterSoldier (accepted answer):

Okay, so it seems to be some kind of compatibility issue between Spark 2.4 and Bahir 2.4. I fixed it by rolling both of them back to version 2.3.

Here is my build.sbt

name := "sparkTest"

version := "0.1"

scalaVersion := "2.11.11"

//For spark
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "2.3.0" ,
    "org.apache.spark" %% "spark-mllib" % "2.3.0" ,
    "org.apache.spark" %% "spark-sql" % "2.3.0" ,
    "org.apache.spark" %% "spark-hive" % "2.3.0" ,
    "org.apache.spark" %% "spark-streaming" % "2.3.0" ,
    "org.apache.spark" %% "spark-graphx" % "2.3.0",
//    "org.apache.spark" %% "spark-streaming-kafka" % "1.6.3",
)

//Bahir
libraryDependencies += "org.apache.bahir" %% "spark-sql-streaming-mqtt" % "2.3.0"