My list contains some words like: ['orange', 'cool', 'app', ...]
and I want to output all these exact whole words (if available) from a description column in a DataFrame.
I have also attached a sample picture with code. I used `str.findall()`.
The picture shows that it extracts `add` from `additional` and `app` from `apple`. However, I do not want that. It should only output a word if it matches the whole word.
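The behavior described above can be reproduced with a short sketch. The original code is only available as a picture, so the DataFrame contents, the column name `description`, the presence of `add` in the list, and the way the pattern is built are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical word list and data, not taken from the original picture.
list1 = ["orange", "cool", "app", "add"]
df = pd.DataFrame({"description": ["additional apple"]})

# Assumed naive pattern: the words joined with "|" and no word boundaries.
naive_pattern = "|".join(list1)

# Substrings inside longer words are matched as well.
matches = df["description"].str.findall(naive_pattern).tolist()
print(matches)  # [['add', 'app']]
```

Without anything anchoring the alternation to word edges, `add` is found inside `additional` and `app` inside `apple`.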
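You can fix the code by wrapping the alternation in word boundaries, using the pattern `fr"\b({'|'.join(list1)})\b"`.

Or, if there can be special chars in your `list1` words, escape each word and use lookarounds instead of `\b`: `fr"(?<!\w)({'|'.join(map(re.escape, list1))})(?!\w)"`.

For a list such as `['orange', 'cool', 'app']`, the patterns created by `fr"\b({'|'.join(list1)})\b"` and `fr"(?<!\w)({'|'.join(map(re.escape, list1))})(?!\w)"` will look like `\b(orange|cool|app)\b` and `(?<!\w)(orange|cool|app)(?!\w)`. See the regex demo.

Note `.str.join(", ")` is considered faster than `.apply(", ".join)`.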
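A minimal sketch of both patterns applied to a DataFrame follows. The sample rows, the column name `description`, and the output column names `found` and `found_safe` are assumptions for illustration; only the two pattern expressions and the `.str.findall(...).str.join(", ")` chain come from the answer.

```python
import re
import pandas as pd

# Hypothetical sample data; the real list and DataFrame are not shown in the post.
list1 = ["orange", "cool", "app"]
df = pd.DataFrame({"description": [
    "additional apple juice",
    "a cool app for orange lovers",
]})

# Word-boundary pattern: fine when every word starts and ends with a word character.
pattern = fr"\b({'|'.join(list1)})\b"

# Escaped words with lookarounds: also works when words contain special characters.
safe_pattern = fr"(?<!\w)({'|'.join(map(re.escape, list1))})(?!\w)"

# findall returns a list of matches per row; .str.join turns each list into one string.
df["found"] = df["description"].str.findall(pattern).str.join(", ")
df["found_safe"] = df["description"].str.findall(safe_pattern).str.join(", ")

print(df["found"].tolist())  # ['', 'cool, app, orange']

# A word such as "c++" (hypothetical) would break the unescaped pattern,
# because "+" is a regex quantifier; the escaped version matches it literally.
special = fr"(?<!\w)({'|'.join(map(re.escape, ['c++']))})(?!\w)"
print(re.findall(special, "I like c++ and c"))  # ['c++']
```

The first row yields an empty string because `additional` and `apple` only contain the listed words as substrings, which is the whole-word behavior the question asks for.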