Using the R twitteR package to analyze tweet text


I'm using the twitteR package (specifically, the searchTwitter function) to export all the tweets containing a specific hashtag to a CSV file.

I would like to analyze their text and find out how many of them contain words from a list that I have saved in a file called importantwords.txt.

How can I write a function that returns a score of how many tweets contain the words listed in importantwords.txt?


There are 2 answers

0
zebrainatree On

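A rough sketch in R, assuming importantwords.txt has one word per line and tweets.csv has a text column:

> words  <- readLines("importantwords.txt")
> tweets <- read.csv("tweets.csv", stringsAsFactors = FALSE)
> for (word in words) {
>     # number of tweets whose text contains this word, ignoring case
>     i <- sum(grepl(tolower(word), tolower(tweets$text), fixed = TRUE))
>     cat(word, ": ", i, "\n", sep = "")
> }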

Is that along the lines of what you wanted?

0
Nathan Hatch On

I think your best bet is to use the tm package.

http://cran.r-project.org/web/packages/tm/index.html

This fella uses it to create word clouds from tweets. Looking through his code will probably help you out too.

http://davetang.org/muse/2013/04/06/using-the-r_twitter-package/

If your list of important words is just meant to exclude words like "the" and "a", this will work fine. If it's for something in particular, you'll need to loop over the corpus with your list of words, retrieving the counts.
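For the second case, here is a minimal sketch of that loop in base R, working on a plain character vector of tweet texts rather than a tm corpus. The function name score_tweets and the example data are made up for illustration, and matching is by case-insensitive substring, which may or may not be what you want.

```r
# Hypothetical helper (not part of tm or twitteR): counts tweets that
# mention the important words.
# `tweets` is a character vector of tweet texts, `words` a character vector of terms.
score_tweets <- function(tweets, words) {
  tweets <- tolower(tweets)
  words  <- tolower(words[nzchar(words)])  # drop empty lines from the word file
  # One column per word: TRUE where the tweet contains that word as a substring
  hits <- vapply(words,
                 function(w) grepl(w, tweets, fixed = TRUE),
                 logical(length(tweets)))
  # Force a tweets-by-words matrix, even when there is only one tweet
  hits <- matrix(hits, nrow = length(tweets), dimnames = list(NULL, words))
  list(
    per_word = colSums(hits),          # number of tweets containing each word
    any_word = sum(rowSums(hits) > 0)  # number of tweets containing at least one word
  )
}

# Made-up data standing in for tweets.csv and importantwords.txt
tweets <- c("I love R and data", "Data science is fun", "Nothing to see here")
words  <- c("data", "love")

result <- score_tweets(tweets, words)
print(result$per_word)  # data: 2, love: 1
print(result$any_word)  # 2
```

With real data you would pass something like read.csv("tweets.csv", stringsAsFactors = FALSE)$text and readLines("importantwords.txt"), assuming the CSV has a text column and the word file has one word per line. Because this is substring matching, "data" also matches "database"; switch to a regular expression with word boundaries if you need whole words only.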

Hope it helps. Nathan