Spark SQL: TwitterUtils Streaming fails for unknown reason

I am using the latest Spark master and additionally, I am loading these jars:

- spark-streaming-twitter_2.10-1.1.0-SNAPSHOT.jar
- twitter4j-core-4.0.2.jar
- twitter4j-stream-4.0.2.jar

My simple test program that I execute in the shell looks as follows:

import org.apache.spark.streaming._
import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.StreamingContext._

System.setProperty("twitter4j.oauth.consumerKey", "jXgXF...")
System.setProperty("twitter4j.oauth.consumerSecret", "mWPvQRl1....")
System.setProperty("twitter4j.oauth.accessToken", "26176....")
System.setProperty("twitter4j.oauth.accessTokenSecret", "J8Fcosm4...")

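// Batch interval of one second; sc is the SparkContext provided by the shell.
val ssc = new StreamingContext(sc, Seconds(1))
// None makes Twitter4J fall back to the OAuth system properties set above.
val tweets = TwitterUtils.createStream(ssc, None)
val statuses = tweets.map(_.getText)
statuses.print()

ssc.start()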

However, I don't get any tweets. The main error I see is:

14/08/04 10:52:35 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.NoSuchMethodError: twitter4j.TwitterStream.addListener(Ltwitter4j/StatusListener;)V
    at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:72)
    ....

And then for each iteration:

INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks

I'm not sure where the problem lies. How can I verify that my twitter credentials are correctly recognized? Might there be another jar missing?

There are 2 answers

Sean Owen (best answer)

NoSuchMethodError should always cause you to ask whether you are running with the same versions of libraries and classes that you compiled with.

If you look at the pom.xml file for the Spark examples module, you'll see that it uses twitter4j 3.0.3. You're bringing an incompatible 4.0.2 with you at runtime, and that breaks it.
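One way to check this from the shell is to ask the JVM which jar a class actually came from and which overloads of a method it exposes. The descriptor in the error, addListener(Ltwitter4j/StatusListener;)V, names a method that takes a StatusListener and returns void, so that is the signature to look for. The sketch below uses only standard Java reflection; describeMethod is a hypothetical helper name, not part of Spark or twitter4j.

```scala
// Hypothetical helper (not a Spark or twitter4j API): reports where a class
// was loaded from and lists the public methods with the given name.
def describeMethod(className: String, methodName: String): Seq[String] = {
  val cls = Class.forName(className)
  // The code source is null for classes loaded by the bootstrap class loader.
  val location = Option(cls.getProtectionDomain.getCodeSource)
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)
    .getOrElse("<bootstrap or unknown>")
  // Method.toString includes the return type and the parameter types.
  val signatures = cls.getMethods.toSeq
    .filter(_.getName == methodName)
    .map(_.toString)
    .sorted
  s"$className loaded from $location" +: signatures
}
```

In the shell from the question, describeMethod("twitter4j.TwitterStream", "addListener").foreach(println) would print the jar that supplied TwitterStream followed by the addListener signatures it offers. If none of them takes a twitter4j.StatusListener and returns void, the jar on the classpath does not match the version the Spark receiver was compiled against.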

Kehe CAI

Yes, Sean Owen has identified the cause. I fixed it by adding two dependencies to the pom.xml file:

<dependency>
    <groupId>org.twitter4j</groupId>
    <artifactId>twitter4j-core</artifactId>
    <version>3.0.6</version>
</dependency>
<dependency>
    <groupId>org.twitter4j</groupId>
    <artifactId>twitter4j-stream</artifactId>
    <version>3.0.6</version>
</dependency>

This changes the twitter4j version from 4.0.x to 3.0.x (http://mvnrepository.com/artifact/org.twitter4j/twitter4j-core), which resolves the incompatibility.
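For a project built with sbt rather than Maven, the same pin might look like the fragment below. This is only a sketch that assumes an sbt build, which the question does not mention; the coordinates and version are the ones from the pom.xml snippet above.

```scala
// build.sbt fragment (assumes an sbt project; not taken from the question).
// Pins both twitter4j artifacts to the same 3.0.x release as the Maven snippet.
libraryDependencies ++= Seq(
  "org.twitter4j" % "twitter4j-core"   % "3.0.6",
  "org.twitter4j" % "twitter4j-stream" % "3.0.6"
)
```

When the jars are passed to the shell by hand, as in the question, the equivalent step is to load the 3.0.x twitter4j-core and twitter4j-stream jars instead of the 4.0.2 ones.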