I have looked everywhere, but I can't seem to find a simple way of getting a substring in Python.

I am using tweepy. I have stored the tweets in an array of textblobs and copied one of those textblobs into a string variable.

Example: "RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"

This is a tweet. I want the "@Acosta" (or "Acosta") part. How would I extract that substring?

I tried using the re library, and while it worked as expected on any other string, it did not work on the tweet:


match = re.search(r"@(.*?):", randTweet).group(1)

2 Answers

zedfoxus On Best Solutions

You could do something like this with split:

>>> test = '"RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"'
>>> mention = test.split('@')
>>> mention
['"RT ', 'Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"']
>>> person = mention[1].split(':')
>>> person
['Acosta', ' Trump defends his “very fine people” comments on Charlottesville', ' “People were there protesting the taking down of the monument…"']
>>> person[0]
'Acosta'

Putting it together:

>>> person = test.split('@')[1].split(':')[0]
>>> person
'Acosta'

Python script

test = '"RT @Acosta: Trump defends his “very fine people” comments on Charlottesville: “People were there protesting the taking down of the monument…"'

mention = test.split('@')
person = mention[1].split(':')

print(person[0])

You should add some error checks to confirm that a mention was actually found before indexing into mention. If the text contains no '@', mention[1] raises an IndexError.
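One way such a check might look is sketched below. The function name first_mention is hypothetical, and the fallback to the first whitespace-delimited token when no ':' follows the '@' is an assumption, not something the question specifies.

```python
def first_mention(text):
    """Return the text after the first '@' up to the next ':', or None if there is no '@'."""
    # partition() never raises: if '@' is missing, found_at is an empty string.
    _, found_at, rest = text.partition('@')
    if not found_at:
        return None
    handle, colon, _ = rest.partition(':')
    if not colon:
        # Assumption: with no ':' present, take the first whitespace-delimited token.
        parts = rest.split()
        handle = parts[0] if parts else ''
    return handle or None


print(first_mention('RT @Acosta: Trump defends his comments'))  # Acosta
print(first_mention('no mention here'))                         # None
```

Returning None instead of raising lets the caller decide how to handle tweets that mention nobody.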

Patrick Artner On

I was unable to reproduce your problem. After fixing

SyntaxError: Non-ASCII character '\xe2' in file main.py on line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

which was caused by your data including “, ” and …, it works:

randTweet = """RT @Acosta: Trump defends his "very fine people" comments on Charlottesville: "People were there protesting the taking down of the monument..." """

import re

# Raw string avoids invalid-escape warnings; '@' and ':' need no escaping.
match = re.search(r"@(.*?):", randTweet).group(1)
print(match) # Acosta
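Note that re.search returns None when nothing matches, so calling .group(1) directly raises an AttributeError on a tweet without a mention. A minimal sketch that guards against this follows, assuming Python 3 and assuming a handle consists only of letters, digits, and underscores (\w+). The names MENTION_RE, first_mention_re, and find_mentions are hypothetical.

```python
import re

# Assumption: a handle is a run of letters, digits, or underscores.
MENTION_RE = re.compile(r"@(\w+)")


def first_mention_re(text):
    """Return the first handle without the '@', or None if there is no mention."""
    match = MENTION_RE.search(text)
    return match.group(1) if match else None


def find_mentions(text):
    """Return every handle in text, without the leading '@'."""
    return MENTION_RE.findall(text)


rand_tweet = "RT @Acosta: Trump defends his “very fine people” comments on Charlottesville"
print(first_mention_re(rand_tweet))            # Acosta
print(first_mention_re("no mention here"))     # None
print(find_mentions("cc @alice and @bob_99"))  # ['alice', 'bob_99']
```

Unlike the "@(.*?):" pattern, this does not require a ':' after the handle, but it will also match the part after '@' in things like email addresses.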