My project is to download a very large number of IDs from Twitter. It is known that the average user has a small number of followers (100-200). I use the Twython package for this, and here is the main part of my program:
    while next_cursor:
        follower_id = twitter.get_followers_ids(user_id=ids, cursor=next_cursor)
        time.sleep(60)
        next_cursor = follower_id['next_cursor']
This is really simple code, and it works, but it is very slow for a large number of IDs, because the rate limit of twitter.get_followers_ids() is 5000 IDs per minute, which is why the time.sleep() call is in the code.
My question: is there any way to speed up this code?
Perhaps so that the program does not pause after each query, but only when it really needs to. Could somebody help with this?
Twitter provides rate-limit info in the headers sent with every API response, so you could check that and call at the maximum rate allowed. You can also request your rate-limit status from Twitter via a specific rate-limit API call, and checking it doesn't use up the rate limit you are checking. I don't use Twython myself, so I can't advise on how to do so within Twython.
It won't gain you much extra -- maybe a few %.
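As a rough, library-agnostic sketch of the header-based idea: the function below sleeps only when the response headers say no calls are left in the current window. It assumes the `x-rate-limit-remaining` and `x-rate-limit-reset` headers of Twitter's REST API, and `fetch_page` is a hypothetical wrapper you would write around your client library; it is not a Twython function.

```python
import time


def seconds_until_allowed(headers, now=None):
    """Return how long to wait before the next call, based on rate-limit headers."""
    now = time.time() if now is None else now
    # Assumed header names; a missing header is treated as "calls still available".
    remaining = int(headers.get("x-rate-limit-remaining", 1))
    reset_at = int(headers.get("x-rate-limit-reset", 0))  # epoch seconds
    if remaining > 0:
        return 0.0
    return max(0.0, reset_at - now)


def fetch_all_ids(fetch_page, sleep=time.sleep):
    """Collect IDs from every page, pausing only when the window is exhausted.

    fetch_page(cursor) is a hypothetical callable returning
    (ids, next_cursor, headers) for one page of results.
    """
    ids, cursor = [], -1  # -1 requests the first page; 0 marks the end
    while cursor:
        page, cursor, headers = fetch_page(cursor)
        ids.extend(page)
        if cursor:
            wait = seconds_until_allowed(headers)
            if wait > 0:
                sleep(wait)
    return ids
```

With this shape the fixed 60-second pause goes away: the loop runs at full speed while calls remain and waits only until the reset time once they are used up.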
Alternatively, it doesn't hurt to bump into the rate-limit occasionally -- you'll get an error message. As long as it isn't too frequent, Twitter won't mind.
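A minimal sketch of that "just bump into the limit" approach, again not tied to any particular library: call as fast as you like, and wait only when the rate-limit error actually arrives. `RateLimitHit` is a hypothetical stand-in for whatever exception your client raises on a rate-limit response (HTTP 429), and the 60-second default wait is an assumption.

```python
import time


class RateLimitHit(Exception):
    """Hypothetical stand-in for the client library's rate-limit error."""

    def __init__(self, retry_after=None):
        super().__init__("rate limit exceeded")
        self.retry_after = retry_after  # seconds to wait, if the server said so


def call_with_backoff(func, *args, default_wait=60, max_retries=5,
                      sleep=time.sleep, **kwargs):
    """Call func; on a rate-limit error, wait and try again."""
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except RateLimitHit as err:
            if attempt == max_retries:
                raise  # give up after too many consecutive rate-limit errors
            wait = err.retry_after if err.retry_after is not None else default_wait
            sleep(wait)
```

In the loop from the question, the call would become something like `call_with_backoff(twitter.get_followers_ids, user_id=ids, cursor=next_cursor)`, with the `except` clause adapted to the error your library really raises.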
As for the basic rate-limit speed cap, there is no way round that. Perhaps Gnip has a paid service that will let you download this data faster?