I have the same problem that was discussed in the question "Python extract sentence containing word", but with one difference: I want to find 2 words in the same sentence. I need to extract the sentences from a corpus that contain 2 specific words. Could anyone help me, please?
Python extracting sentence containing 2 words
6.4k views, asked by Marcelo
There are 3 answers
This would be simple using the TextBlob package together with Python's built-in sets.
Basically, iterate through the sentences of your text and check whether your search words are a subset of the set of words in each sentence, so that only sentences containing both words match.
from textblob import TextBlob
search_words = set(["buy", "apples"])
blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
matches = []
for sentence in blob.sentences:
    words = set(sentence.words)
    if search_words <= words:  # subset test: every search word is in the sentence
        matches.append(str(sentence))
print(matches)
# ["Let's go buy some apples."]
Update: Or, more Pythonically,
from textblob import TextBlob
search_words = set(["buy", "apples"])
blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
matches = [str(s) for s in blob.sentences if search_words <= set(s.words)]
print(matches)
# ["Let's go buy some apples."]
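Since the question asks for sentences containing both words, it may help to compare the two set checks side by side. The following is only a small sketch on hypothetical, already-tokenized sentences (the word sets are made up for illustration): the intersection `&` is non-empty as soon as any one search word is present, while the subset test `<=` holds only when every search word is present.

```python
search_words = {"buy", "apples"}

# Hypothetical pre-tokenized sentences, used only for illustration.
only_one = {"I", "buy", "bread"}
both = {"Let", "go", "buy", "some", "apples"}

# Intersection: true when at least one search word is shared.
any_match_one = bool(search_words & only_one)

# Subset: true only when all search words are in the sentence.
all_match_one = search_words <= only_one
all_match_both = search_words <= both

print(any_match_one, all_match_one, all_match_both)
# True False True
```

For the "both words" requirement, the subset test is the one that fits.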
I think you want an answer using nltk. And I guess that those 2 words don't need to be consecutive, right?
>>> from nltk.tokenize import sent_tokenize, word_tokenize
>>> text = "I like to eat apple. Me too. Let's go buy some apples."
>>> words = ['like', 'apple']
>>> sentences = sent_tokenize(text)
>>> for sentence in sentences:
...     tokens = word_tokenize(sentence)  # match whole tokens, not substrings
...     if all(word in tokens for word in words):
...         print(sentence)
...
I like to eat apple.
If this is what you mean:
You can also try with:
Check whether the sentence contains the defined words:
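A minimal sketch of such a check, using only the standard library. Note that the function name `sentences_containing` and both regular expressions are assumptions made for this example, not code from the original answer. The sentence splitting and tokenization are deliberately naive, so a real corpus would probably be better served by a proper tokenizer such as the ones shown in the other answers.

```python
import re

def sentences_containing(text, words):
    """Return the sentences of `text` that contain every word in `words`."""
    required = {w.lower() for w in words}
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    matches = []
    for sentence in sentences:
        # Naive tokenization: runs of word characters and apostrophes.
        tokens = set(re.findall(r"[\w']+", sentence.lower()))
        if required <= tokens:  # all required words are present
            matches.append(sentence)
    return matches

text = "I like to eat apple. Me too. Let's go buy some apples."
print(sentences_containing(text, ["buy", "apples"]))
# ["Let's go buy some apples."]
```

Matching is case-insensitive and works on whole tokens, so "apple" does not match "apples".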