Good evening,
I just started using Python for a project: I want to collect social media data from several platforms and then analyze it. For this, I need to retrieve tweet data from Weibo.
I chose to use the weibo_scraper library for this work. Following the example on its website, my code is the following:
from weibo_scraper import get_weibo_tweets_by_name

for tweet in get_weibo_tweets_by_name(name='嘻红豆'):
    print(tweet)
The result looks like this:
{'card_type': 9, 'itemid': '1076033637346297_-_4341063131108312', 'scheme': 'https://m.weibo.cn/status/HheeR4Ek0?mblogid=HheeR4Ek0&luicode=10000011&lfid=1076033637346297', 'mblog': {'created_at': '12小时前', 'id': '4341063131108312', 'idstr': '4341063131108312', 'mid': '4341063131108312', 'can_edit': False, 'show_additional_indication': 0, 'text': '行吧//<a href=\'/n/夏正正\'>@夏正正</a>:我没有,我没说过。<span class="url-icon"><img alt=[感冒] src="//h5.sinaimg.cn/m/emoticon/icon/default/d_ganmao-babf39d6ae.png" style="width:1em; height:1em;" /></span>
I'm not sure whether the other way of retrieving tweets makes it easier to turn them into a dataframe, but here is the other way of doing it:
from weibo_scraper import get_formatted_weibo_tweets_by_name

result_iterator = get_formatted_weibo_tweets_by_name(name='嘻红豆', pages=None)
for user_meta in result_iterator:
    for tweetMeta in user_meta.cards_node:
        print(tweetMeta.mblog.text)
With the following result:
行吧//<a href='/n/夏正正'>@夏正正</a>:我没有,我没说过。<span class="url-icon"><img alt=[感冒] src="//h5.sinaimg.cn/m/emoticon/icon/default/d_ganmao-babf39d6ae.png" style="width:1em; height:1em;" /></span>//<a href='/n/勺布斯'>@勺布斯</a>:<span class="url-icon"><img alt=[二哈] src="//h5.sinaimg.cn/m/emoticon/icon/others/d_erha-0d2bea3a7d.png" style="width:1em; height:1em;" /></span>//<a href='/n/暴躁豆奶包'>@暴躁豆奶包</a>:逃避虽然舒服但没用//<a href='/n/by语冰'>@by语冰</a>: 难受//<a href='/n/-Lillyyyyyy-'>@-Lillyyyyyy-</a>:扎心
From here, I'm not sure how to proceed to get the data into a pandas dataframe (should I create a CSV first, or transform the data directly?).
I would like to have some guidance on this if possible.
Thank you very much for reading.
While it's hard for me to grasp exactly what you are looking to achieve, I think this should get you started with a dataframe. You can start earlier by appending each tweet itself to a list, then use pd.DataFrame(tweets) to create a dataframe and expand and extract from there, or you can do the below.
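As a rough illustration of the first approach (collect the tweets in a list, build a dataframe, then expand the nested fields), here is a minimal sketch. It assumes each tweet is a plain dict shaped like the output printed in the question. The helper name tweets_to_dataframe and the second sample record are made up for the example, and pd.json_normalize is used here as one way to do the "expanding" step. The sample list stands in for the real scraper call so the snippet runs without network access.

```python
import pandas as pd

# Sample records shaped like the dicts printed in the question (fields trimmed).
# With the real scraper, this list would instead be built with something like:
#     tweets = list(get_weibo_tweets_by_name(name='嘻红豆'))
tweets = [
    {
        "card_type": 9,
        "itemid": "1076033637346297_-_4341063131108312",
        "mblog": {
            "created_at": "12小时前",
            "id": "4341063131108312",
            "text": "行吧//<a href='/n/夏正正'>@夏正正</a>:我没有,我没说过。",
        },
    },
    {
        # Hypothetical second record, invented only to show a second row.
        "card_type": 9,
        "itemid": "example_itemid",
        "mblog": {
            "created_at": "1天前",
            "id": "0000000000000000",
            "text": "example text",
        },
    },
]


def tweets_to_dataframe(tweets):
    """Flatten an iterable of tweet dicts into a DataFrame.

    Nested dicts such as 'mblog' become dotted columns,
    e.g. 'mblog.text' and 'mblog.created_at'.
    """
    return pd.json_normalize(list(tweets))


df = tweets_to_dataframe(tweets)

# Keep only the columns of interest for the analysis.
print(df[["mblog.id", "mblog.created_at", "mblog.text"]])
```

If a CSV file is wanted as well, it can be written from the dataframe afterwards, for example with df.to_csv('weibo_tweets.csv', index=False, encoding='utf-8-sig'); the file name is only an example, and going through a CSV is not required to get a dataframe.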