Python: how to remove words from a string based on items in a list


I have a list as shown below:

exclude = ["please", "hi", "team"]

I have a string as follows:

text = "Hi team, please help me out."

I want my string to look as:

text = ", help me out."

That is, I want to strip out every word that appears in the list exclude.

I tried the below:

if any(e in text.lower() for e in exclude):
    print text.lower().strip(e)

But any() only returns a boolean value, and e does not exist outside the generator expression, so I get the error below:

NameError: name 'e' is not defined

How do I get this done?

There are 5 answers

Ashwini Chaudhary (best answer)

Something like this?

>>> from string import punctuation
>>> ' '.join(x for x in (word.strip(punctuation) for word in text.split())
                                                   if x.lower() not in exclude)
'help me out'

If you want to keep the trailing/leading punctuation with the words that are not present in exclude:

>>> ' '.join(word for word in text.split()
                             if word.strip(punctuation).lower() not in exclude)
'help me out.'

First one is equivalent to:

>>> out = []
>>> for word in text.split():
        word = word.strip(punctuation)
        if word.lower() not in exclude:
            out.append(word)
>>> ' '.join(out)
'help me out'
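
The loop above can also be wrapped in a small helper. This is only a sketch, assuming Python 3: the name remove_words is made up for illustration, and turning exclude into a set is an optional tweak for faster lookups, not something the original snippet does.

```python
from string import punctuation

def remove_words(text, exclude):
    """Drop each word whose lowercase, punctuation-stripped form is in `exclude`."""
    banned = {w.lower() for w in exclude}  # set membership tests are fast
    kept = []
    for word in text.split():
        core = word.strip(punctuation)  # "team," -> "team"
        if core.lower() not in banned:
            kept.append(core)
    return " ".join(kept)

result = remove_words("Hi team, please help me out.", ["please", "hi", "team"])
print(result)  # help me out
```

As in the first variant, punctuation attached to the kept words is removed as well.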

Hackaholic

If you are not worried about punctuation:

>>> import re
>>> text = "Hi team, please help me out."
>>> text = re.findall(r"\w+", text)
>>> text
['Hi', 'team', 'please', 'help', 'me', 'out']
>>> " ".join(x for x in text if x.lower() not in exclude)
'help me out'

In the above code, re.findall finds all words and returns them in a list.
\w matches a word character: a letter, a digit, or the underscore ([A-Za-z0-9_] for ASCII text)
+ means one or more occurrences
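
To see what \w+ does and does not treat as part of a word, here is a small check on a made-up sample string (the sample is an assumption, not from the question). Digits and the underscore stay inside a word, while the comma, hyphen and full stop split or end words.

```python
import re

# Hypothetical sample chosen to include an underscore, digits and a hyphen
sample = "Hi team_42, please help-me out."

# Each match is a maximal run of word characters
words = re.findall(r"\w+", sample)
print(words)  # ['Hi', 'team_42', 'please', 'help', 'me', 'out']
```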

Ashwani

You can use this (remember that it is case sensitive, so "Hi" is left in place, and that it also removes matches inside longer words):

for word in exclude:
    text = text.replace(word, "")
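
A short sketch of how the replace loop behaves, next to one possible alternative. The sample sentence is extended with "of this" purely for illustration, and the word-boundary regex is a different technique from the loop above, not part of this answer.

```python
import re

exclude = ["please", "hi", "team"]
text = "Hi team, please help me out of this."

# Plain str.replace: case sensitive and substring based
replaced = text
for word in exclude:
    replaced = replaced.replace(word, "")
print(repr(replaced))  # 'Hi ,  help me out of ts.'

# Alternative: whole words only, ignoring case
pattern = re.compile(
    r"\b(?:" + "|".join(map(re.escape, exclude)) + r")\b",
    re.IGNORECASE,
)
cleaned = pattern.sub("", text)
print(repr(cleaned))  # ' ,  help me out of this.'
```

The first result keeps "Hi" and damages "this"; the second removes only whole words but leaves the surrounding spaces and punctuation in place.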

Lord Henry Wotton

This replaces everything that is not a word character, or that belongs to the stopwords list, with a space, and then splits the result into the words you want to keep. Finally, the list is joined back into a string with single spaces between the words. Note: it is case sensitive.

' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())

Example usage:

>>> import re
>>> stopwords=['please','hi','team']
>>> sentence='hi team, please help me out.'
>>> ' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())
'help me out'
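
The one-liner can be unpacked into its three steps. This is only a sketch assuming Python 3; the intermediate names pattern, spaced and result are introduced here for illustration.

```python
import re

stopwords = ["please", "hi", "team"]
sentence = "hi team, please help me out."

# Step 1: one pattern matching any non-word character or any stopword
pattern = r"\W|" + "|".join(stopwords)  # \W|please|hi|team

# Step 2: replace every match with a space
spaced = re.sub(pattern, " ", sentence)

# Step 3: split on whitespace (empty pieces vanish) and rejoin with single spaces
result = " ".join(spaced.split())
print(result)  # help me out
```

Because the stopwords are matched as substrings, a word such as "this" would come out as "t s"; wrapping the alternation in \b...\b is one way to restrict it to whole words.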

The6thSense

Using simple methods:

import re
exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."
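kept = []

# \w+ yields each run of word characters, so punctuation is dropped
for word in re.findall(r"\w+", text):
    # compare case-insensitively against the exclusion list
    if word.upper() not in (name.upper() for name in exclude):
        kept.append(word)
print(" ".join(kept))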

Hope it helps