I have a dataframe of thousands of news articles that looks like this:
| id | text | date |
|---|---|---|
| 1 | newyorktimes leaders gather for the un summit in next week to discuss | 1980-1-18 |
| 2 | newyorktimes opinion section what the washingtonpost got wrong about | 1980-1-22 |
| 3 | a journalist for the washingtonpost went missing while on assignment | 1980-1-22 |
| 4 | washingtonpost president carter responds to criticisms on economic decline | 1980-1-28 |
| 5 | newyorktimes opinion section what needs to be done with about the rats | 1980-1-29 |
I want to add a column that holds the combined count of several specific words in each article's text. Say I want to know how many times "newyorktimes", "washingtonpost", and "the" appear in each article. The new column should contain the total of those counts for each row, like this:
| id | text | date | wordlistcount |
|---|---|---|---|
| 1 | newyorktimes leaders gather for the un summit in next week to discuss | 1980-1-18 | 2 |
| 2 | newyorktimes opinion section what the washingtonpost and newyorktimes got wrong | 1980-1-22 | 4 |
| 3 | a journalist for the washingtonpost went missing while on assignment | 1980-1-22 | 2 |
| 4 | washingtonpost president carter responds to criticisms on economic decline | 1980-1-28 | 1 |
| 5 | newyorktimes opinion section what needs to be done with about the rats | 1980-1-29 | 2 |
How can I accomplish this? Any help would be greatly appreciated.
In `stringr`, with `str_count`:
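A minimal sketch of that approach, assuming the data frame is called `df` and the search terms are held in a vector called `words` (both names are placeholders, and the sample rows are copied from the expected-output table in the question). The terms are joined into a single regular expression and `str_count` is applied to the `text` column:

```r
library(stringr)

# Hypothetical sample data mirroring the expected-output table in the question
df <- data.frame(
  id = 1:5,
  text = c(
    "newyorktimes leaders gather for the un summit in next week to discuss",
    "newyorktimes opinion section what the washingtonpost and newyorktimes got wrong",
    "a journalist for the washingtonpost went missing while on assignment",
    "washingtonpost president carter responds to criticisms on economic decline",
    "newyorktimes opinion section what needs to be done with about the rats"
  ),
  date = c("1980-1-18", "1980-1-22", "1980-1-22", "1980-1-28", "1980-1-29"),
  stringsAsFactors = FALSE
)

# Words whose occurrences should be summed per row (assumed list)
words <- c("newyorktimes", "washingtonpost", "the")

# Join the words with "|" and wrap them in word boundaries (\\b),
# so that "the" is not counted inside words such as "gather"
pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")

# str_count() is vectorised over the text column and returns one count per row
df$wordlistcount <- str_count(df$text, pattern)

df$wordlistcount
# 2 4 2 1 2
```

The word boundaries matter here: with the plain pattern `newyorktimes|washingtonpost|the`, the first row would come out as 3 instead of 2, because "gather" contains "the". If the search terms could contain regex metacharacters, they would need escaping before being pasted into the pattern.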