So I'm working with Python and the Twitter API, using Tweepy and Twitter's Stream API, which returns Tweet objects in real time. Part of my app, which queries a different API, doesn't play nice with URLs in the tweet text, so I'm using the Python re module to replace them with a harmless identifier string. However, I'm having trouble finding the URLs that need to be parsed out of the text. Instead of searching through the text for URLs myself, I decided to use the ones that the API delivers and do a "find and replace" in the text.
Here is the documentation on what the API gives me. It gives a t.co URL, a display URL, and a fully expanded URL. The problem with just using the t.co URL is that Twitter doesn't automatically convert all URLs in tweets to t.co, only ones past a certain length. This means that the t.co URL isn't always the same one that appears in the tweet text.
So I need to figure out how to get, from the API, the version of the URL which actually appears in the text of the tweet.
Thanks! evamvid
Try using this for the expanded_url:
That should give you the link the way you want it.
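Since it isn't certain which form of the URL ends up in the text, here is a rough sketch of two ways to handle it. It assumes each URL entity is a dict with the url, expanded_url, display_url and indices keys described in Twitter's entities documentation (with Tweepy these would typically come from something like status.entities["urls"]). The function names and the "URL" placeholder are made up for illustration.

```python
def replace_urls(text, url_entities, placeholder="URL"):
    """Replace every URL span reported by the API with a placeholder.

    Assumes each entity carries "indices": [start, end] into the tweet text.
    """
    # Work from the end of the text so earlier offsets stay valid.
    ordered = sorted(url_entities, key=lambda e: e["indices"][0], reverse=True)
    for entity in ordered:
        start, end = entity["indices"]
        text = text[:start] + placeholder + text[end:]
    return text


def find_url_in_text(text, entity):
    """Return whichever form of the URL literally appears in the text, or None."""
    # display_url is usually a substring of expanded_url, so check it last.
    for key in ("url", "expanded_url", "display_url"):
        candidate = entity.get(key)
        if candidate and candidate in text:
            return candidate
    return None


# Hypothetical entity, shaped like the ones the API documents.
entity = {
    "url": "http://t.co/abc123",
    "expanded_url": "http://example.org/long",
    "display_url": "example.org/long",
    "indices": [15, 33],
}
print(replace_urls("Check this out http://t.co/abc123 now", [entity]))
```

The indices approach sidesteps the question of which form appears, because it cuts out the span no matter what it contains. The second function is a fallback for when you'd rather do a plain string find and replace.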