How to configure Apache Flume to fetch data from Twitter for a specific period?

I have a Hadoop cluster and use Apache Flume to load data from Twitter into HDFS. By default it fetches data in chronological order, with the most recent tweets fetched first. I now have a use case that requires fetching Twitter data for a specific period, say February 2013. Is there any configuration or property in Flume, or for the Twitter handle, that needs to be set for this?

Thanks in advance.

There is 1 answer

vishnu viswanath

You might want to use a custom source for Flume.

http://blog.cloudera.com/blog/2012/10/analyzing-twitter-data-with-hadoop-part-2-gathering-data-with-flume/

The TwitterSource described in the link above will help you fetch Twitter data based on keywords.
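For reference, a Flume agent using that TwitterSource could be configured roughly as below. This is only a sketch based on the example from the linked post: the agent and component names (TwitterAgent, Twitter, MemChannel, HDFS), the keyword list, the HDFS path and the credential placeholders are assumptions to replace with your own values. Note that the keywords property filters the live stream by search term only. As far as I know it has no setting for a date range, so on its own it will not select a past month such as February 2013.

```properties
# Sketch of a Flume agent using the custom TwitterSource from the linked post.
# All names, paths and values below are illustrative assumptions.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Custom source class from the Cloudera example; its jar must be on Flume's classpath.
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
# Placeholders: fill in the credentials of your own Twitter application.
TwitterAgent.sources.Twitter.consumerKey = <your consumer key>
TwitterAgent.sources.Twitter.consumerSecret = <your consumer secret>
TwitterAgent.sources.Twitter.accessToken = <your access token>
TwitterAgent.sources.Twitter.accessTokenSecret = <your access token secret>
# Comma-separated keywords to track (keyword filter only, not a date filter).
TwitterAgent.sources.Twitter.keywords = hadoop, big data, flume

# In-memory channel between the source and the sink.
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

# HDFS sink; the path is a hypothetical example, partitioned by event time.
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://namenode:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
```

Because the sink path is partitioned by year, month, day and hour, tweets collected from now on can later be selected by period simply by reading the matching HDFS directories.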