I am a PhD student working on mining Twitter data. I need to collect user IDs for realDonaldTrump's 88M followers. This will take about 15 days, but the internet connection in my apartment is not very reliable, so I keep running into this:
7500000 followers!
Waiting about 15 minutes for rate limit reset...
Error in curl::curl_fetch_memory(url, handle = handle) :
Could not resolve host: api.twitter.com
and all 7,500,000 followers I have already downloaded are lost. Could you please give me some suggestions about:
- saving the followers retrieved so far before starting the next round (a rough sketch of what I mean follows this list), or
- making rtweet check the internet connection before deciding whether to continue or to keep waiting.
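Here is a minimal sketch of the kind of resumable, chunked download I have in mind. It assumes an rtweet version (around 0.7) where get_followers() accepts a `page` cursor and next_cursor() extracts the cursor for the next request; the chunk size and file names are just examples.

```r
library(rtweet)

cursor <- "-1"   # "-1" = start from the beginning
chunk  <- 1L

repeat {
  res <- tryCatch(
    get_followers("realDonaldTrump",
                  n = 75000,               # 15 requests of 5000 ids = one rate-limit window
                  page = cursor,
                  retryonratelimit = TRUE),
    error = function(e) NULL               # e.g. "Could not resolve host: api.twitter.com"
  )

  if (is.null(res)) {
    message("Network problem at chunk ", chunk, "; sleeping 5 minutes and retrying")
    Sys.sleep(300)
    next                                   # retry the same cursor, nothing already saved is lost
  }

  saveRDS(res, sprintf("trump_followers_chunk_%05d.rds", chunk))  # persist progress to disk
  cursor <- next_cursor(res)
  chunk  <- chunk + 1L

  if (is.null(cursor) || identical(cursor, "0")) break  # cursor "0" signals the last page
}
```

The saved chunks can then be recombined at the end, e.g. with `do.call(rbind, lapply(list.files(pattern = "^trump_followers_chunk_"), readRDS))`.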
Thanks so much. WZ
In fact, I'm doing research on polarization, so I need to classify every user as Republican or Democrat, depending on how many Republican (Democratic) accounts the user follows. Because getting followees is very time-consuming, I instead get all followers of each politician and count how many times a user appears in the Republican/Democratic politicians' follower sets, which makes the vectors very large. This is the only solution I can think of, even though it is clumsy. I just use one line of code:
TrumpFollowers <- get_followers("realDonaldTrump", n = 90000000, retryonratelimit = TRUE)
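And this is roughly the counting/classification step I described, as a base-R sketch. Everything here is hypothetical: it assumes each politician's followers were saved as one file per account (`followers_<screen_name>.rds`), that the downloaded tables have a `user_id` column, and that `rep_accounts` and `dem_accounts` are hand-made vectors of politicians' screen names.

```r
# Count how many times each user id appears across a set of follower files.
count_appearances <- function(accounts) {
  ids <- unlist(lapply(accounts, function(a)
    readRDS(sprintf("followers_%s.rds", a))$user_id))
  table(ids)                       # = how many of these accounts the user follows
}

rep_counts <- count_appearances(rep_accounts)   # rep_accounts / dem_accounts are
dem_counts <- count_appearances(dem_accounts)   # hand-made vectors of screen names

all_ids <- union(names(rep_counts), names(dem_counts))
r <- as.integer(rep_counts[all_ids]); r[is.na(r)] <- 0L   # missing = follows none
d <- as.integer(dem_counts[all_ids]); d[is.na(d)] <- 0L

classification <- data.frame(
  user_id = all_ids,
  n_rep   = r,
  n_dem   = d,
  label   = ifelse(r > d, "republican", ifelse(d > r, "democrat", "unclear"))
)
```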