Limit tweepy stream to a specific number

class listener(StreamListener):

    def on_status(self, status):
        try:
            userid = status.user.id_str
            geo = str(status.coordinates)
            if geo != "None":
                print(userid + ',' + geo)
            else:
                print("No coordinates")
            return True
        except BaseException as e:
            print('failed on_status,',str(e))
            time.sleep(5)

    def on_error(self, status):
        print(status)


auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(locations=[-97.54,32.55,-97.03,33.04])

I have this script for my tweepy stream, and it works perfectly. However, it keeps going until I terminate it using 'ctrl+c'. I tried adding a counter to "on_status" but it does not increment:

class listener(StreamListener):

    def on_status(self, status):
        i = 0
        while i < 10:
            userid = status.user.id_str
            geo = str(status.coordinates)
            if geo != "None":
                print(userid + ',' + geo)
                i += 1

No matter where I put the increment, it repeats itself. If I add "i=0" before the class I get an error:

RuntimeError: No active exception to reraise

Any idea how I can make the counter work with streaming? The Cursor that comes with tweepy does not work with streaming, as far as I know at least.

Best answer, by ZdaR:

Your while loop does not work because Tweepy calls on_status() once for every status it receives. Inside that method, i is a local variable that is reset to 0 on each call, and the loop keeps reprocessing the same status (forever, if that status has no coordinates, since i is never incremented). You cannot control the stream by adding a loop inside a callback that Tweepy's own read loop is already driving. A better approach is to store the counter as an instance attribute, which is initialized once when the listener object is created, and to increment that attribute inside on_status().

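import time

from tweepy.streaming import StreamListener  # tweepy 3.x API


class listener(StreamListener):

    def __init__(self):
        super().__init__()
        self.counter = 0  # statuses handled so far; persists between calls
        self.limit = 10   # stop after this many statuses

    def on_status(self, status):
        try:
            userid = status.user.id_str
            geo = str(status.coordinates)
            if geo != "None":
                print(userid + ',' + geo)
            else:
                print("No coordinates")
            self.counter += 1
            if self.counter < self.limit:
                return True
            # Limit reached: close the connection. This relies on the
            # module-level twitterStream created after the class definition.
            twitterStream.disconnect()
            return False
        except Exception as e:
            # Exception rather than BaseException, so Ctrl+C still interrupts.
            print('failed on_status,', str(e))
            time.sleep(5)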
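
To see why the counter has to live on the instance, here is a minimal sketch that runs without tweepy or network access. LocalCounterListener, CountingListener and fake_stream are hypothetical stand-ins written for illustration, not part of tweepy; fake_stream only mimics the idea of a stream calling on_status() once per status and stopping when the callback returns False. The local-counter version is a simplified form of the attempt in the question.

```python
class LocalCounterListener:
    """Simplified version of the question's attempt: counter is a local variable."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = []

    def on_status(self, status):
        i = 0  # re-created on every call, so it never accumulates
        self.seen.append(status)
        i += 1
        return i < self.limit  # i is always 1 here


class CountingListener:
    """Counter stored on the instance, as the answer suggests."""

    def __init__(self, limit):
        self.counter = 0  # survives between on_status() calls
        self.limit = limit
        self.seen = []

    def on_status(self, status):
        self.seen.append(status)
        self.counter += 1
        # False asks the driver to stop delivering statuses.
        return self.counter < self.limit


def fake_stream(listener, statuses):
    """Hypothetical driver: one callback per status, stop on False."""
    for status in statuses:
        if listener.on_status(status) is False:
            break


statuses = ["a", "b", "c", "d", "e"]

local = LocalCounterListener(limit=3)
fake_stream(local, statuses)
print(local.seen)     # ['a', 'b', 'c', 'd', 'e'] (never stops early)

counting = CountingListener(limit=3)
fake_stream(counting, statuses)
print(counting.seen)  # ['a', 'b', 'c']
```

The only difference between the two classes is where the counter is stored: the local variable starts over on every callback, while the attribute keeps its value for the lifetime of the listener object.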