Python extracting sentence containing 2 words

Question

6.5k views Asked by Marcelo At 30 August 2013 at 09:11

I have the same problem that was discussed in this link, Python extract sentence containing word, but the difference is that I want to find 2 words in the same sentence. I need to extract the sentences from a corpus that contain 2 specific words. Could anyone help me, please?

There are 3 answers

moliware On 30 August 2013 at 10:13

I think you want an answer using nltk. And I guess that those 2 words don't need to be consecutive, right?

>>> from nltk.tokenize import sent_tokenize, word_tokenize
>>> text = "I like to eat apple. Me too. Let's go buy some apples."
>>> words = ['like', 'apple']
>>> sentences = sent_tokenize(text)
>>> for sentence in sentences:
...   if all(word in sentence for word in words):
...      print(sentence)
...
I like to eat apple.
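
One caveat with the check above: `word in sentence` is a substring test, so 'apple' would also match 'apples' or 'pineapple'. If whole-word matches are wanted, a rough sketch using only the standard library might look like the following. The function name `sentences_with_all_words` and the naive regex sentence splitter are assumptions for illustration, not part of nltk; `sent_tokenize` would likely do a better job of splitting real text.

```python
import re

def sentences_with_all_words(text, words):
    """Return the sentences that contain every word in `words` as a whole word."""
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # \b anchors each word at word boundaries, so 'apple' does not match 'apples'.
    patterns = [re.compile(r"\b%s\b" % re.escape(w), re.IGNORECASE) for w in words]
    return [s for s in sentences if all(p.search(s) for p in patterns)]

text = "I like to eat apple. Me too. Let's go buy some apples."
print(sentences_with_all_words(text, ["like", "apple"]))
print(sentences_with_all_words(text, ["buy", "apple"]))
```

With the sample text, the second call finds nothing, because 'apple' no longer matches inside 'apples'.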

Steve L On 30 August 2013 at 20:56

This would be simple using the TextBlob package together with Python's builtin sets.

Basically, iterate through the sentences of your text, and check whether there is an intersection between the set of words in the sentence and your search words.

from textblob import TextBlob  # older releases used: from text.blob import TextBlob

search_words = set(["buy", "apples"])
blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
matches = []
for sentence in blob.sentences:
    words = set(sentence.words)
    if search_words & words:  # intersection
        matches.append(str(sentence))
print(matches)
# ["Let's go buy some apples."]

Update: Or, more Pythonically,

from textblob import TextBlob  # older releases used: from text.blob import TextBlob

search_words = set(["buy", "apples"])
blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
matches = [str(s) for s in blob.sentences if search_words & set(s.words)]
print(matches)
# ["Let's go buy some apples."]
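
Note that a non-empty intersection means the sentence contains at least one of the search words. If both words must appear, as the question asks, a subset test (`<=`) is probably closer to what is wanted. Below is a minimal sketch with plain Python sets; `tokenize` and `matching` are hypothetical helpers standing in for TextBlob's `sentence.words`, so the idea can be tried without installing anything.

```python
import re

def tokenize(sentence):
    # Hypothetical stand-in for a real tokenizer: lowercase runs of letters/apostrophes.
    return set(re.findall(r"[a-z']+", sentence.lower()))

def matching(sentences, search_words, require_all=True):
    """Keep sentences containing all (or, optionally, any) of the search words."""
    if require_all:
        # Subset test: every search word must be among the sentence's words.
        return [s for s in sentences if search_words <= tokenize(s)]
    # Intersection test: at least one search word is present.
    return [s for s in sentences if search_words & tokenize(s)]

sentences = ["I like to eat apple.", "Me too.", "Let's go buy some apples."]
print(matching(sentences, {"like", "apples"}, require_all=False))
print(matching(sentences, {"like", "apples"}, require_all=True))
print(matching(sentences, {"buy", "apples"}, require_all=True))
```

In the TextBlob version, the same change would amount to replacing `search_words & words` with `search_words <= words`.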

badc0re On 30 August 2013 at 09:17 (accepted answer)

If this is what you mean:

import re
txt = "I like to eat apple. Me too. Let's go buy some apples."
define_words = 'some apple'
# Match a sentence (text up to the next period) that contains the phrase.
print(re.findall(r"([^.]*?%s[^.]*\.)" % define_words, txt))

Output: [" Let's go buy some apples."]

You can also try with:

define_words = input("Enter string: ")  # use raw_input() on Python 2

Check whether the sentence contains all of the defined words:

import re
txt = "I like to eat apple. Me too. Let's go buy some apples."
words = 'go apples'.split(' ')

# Split into sentences on periods, then keep those containing every word.
sentences = re.findall(r"([^.]*\.)", txt)
for sentence in sentences:
    if all(word in sentence for word in words):
        print(sentence)
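
If a single regular expression is preferred, one lookahead per word can require every word to appear somewhere before the next period, in any order. This is only a sketch: `find_sentences` is a made-up name, it assumes sentences end with a period (like the patterns above), and it matches whole words via `\b`, which differs from the substring test in the loop.

```python
import re

def find_sentences(text, words):
    """Return sentences (period-terminated) containing every word, in any order."""
    # One lookahead per word; [^.]* keeps each search inside the current sentence.
    lookaheads = "".join(r"(?=[^.]*\b%s\b)" % re.escape(w) for w in words)
    pattern = re.compile(lookaheads + r"[^.]*\.")
    # Strip the leading space left over from the previous sentence's period.
    return [m.strip() for m in pattern.findall(text)]

txt = "I like to eat apple. Me too. Let's go buy some apples."
print(find_sentences(txt, ["go", "apples"]))
print(find_sentences(txt, ["like", "apples"]))
```

Because the lookaheads do not consume text, the order of the words in the list does not matter.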
