Python: PorterStemmer on a list of sentences

I have the following data (a few entries from the full list): ['Came here for the witches, stayed for the Gwent.\n',

"Great horror game, can get a little repetitive so a new fresh set of maps is ideal.The level up system is trash, i've never been so confused between 2 addons, offerings, and perk system.Major downfall is the match-making system, you can be thrown against a level 39 killer with 3 of the 4 survivors a level 1.\n",

'its very good\n',

'Amazing!!!!!!!!!!!!\n',

"If you are looking for a fun game to play this game is the game for you. Constant laughs and funny references combined with action packed game play and a open world full of nooks and crannies for you to explore. All this and more makes this game one of my favourites.This game has created many memorable moments and is worth more then it's extremely low price.The verdict - 9/10\n", 'best 3rd person survival game in my opinion especially with the dlcs.\n', 'GOOD HUNGRY JACK GAME\n']

Each '\n' marks the end of one line in the data.

Thus if I call data[1] I get the following:

"Great horror game, can get a little repetitive so a new fresh set of maps is ideal.The level up system is trash, i've never been so confused between 2 addons, offerings, and perk system.Major downfall is the match-making system, you can be thrown against a level 39 killer with 3 of the 4 survivors a level 1.\n",

Help part:

I can tokenize each string individually and then apply PorterStemmer. How can I do this for all sentences at once? I already wrote a for loop that tokenizes every sentence in the list data. Do I need another for loop to stem every sentence too? If so, how do I write it?

For a single string I used:

from nltk.stem import PorterStemmer

ps = PorterStemmer()

# text is the list of tokens for one sentence
word_after_stem = []
for w in text:
    word_after_stem.append(ps.stem(w))

There is 1 answer

Shahid Khan

I hope this tutorial on stemming and lemmatization with NLTK helps:

https://www.guru99.com/stemming-lemmatization-python-nltk.html#:~:text=Stemming%20is%20a%20method%20of%20normalization%20of%20words,according%20to%20the%20context%20or%20sentence%20are%20normalized.

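You can also do this if you have a list of sentences. Tokenize each sentence first, because looping over a raw string yields single characters, not words (word_tokenize needs NLTK's Punkt tokenizer data to be downloaded):

from nltk.tokenize import word_tokenize

word_after_stem = []
for sent in data:
  for w in word_tokenize(sent):
    word_after_stem.append(ps.stem(w))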
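
The loop above collects every stem into one flat list. If you would rather keep one list of stems per sentence, something like the following might work. This is only a sketch: the helper name stem_sentences and the simple regex tokenizer are my own assumptions for illustration (nltk's word_tokenize could be used in its place), and the ImportError fallback is there only so the snippet still runs where NLTK is missing.

```python
import re

try:
    from nltk.stem import PorterStemmer
    stem = PorterStemmer().stem
except ImportError:
    # Assumption: NLTK is normally installed. This fallback only lowercases
    # each word (no real stemming) so that the sketch still runs without it.
    stem = str.lower


def stem_sentences(sentences, stem_word=stem):
    """Return one list of stemmed tokens per input sentence."""
    result = []
    for sentence in sentences:
        # Simple tokenizer (an assumption): runs of letters, digits, apostrophes.
        tokens = re.findall(r"[A-Za-z0-9']+", sentence)
        result.append([stem_word(token) for token in tokens])
    return result


data = ['its very good\n', 'Amazing!!!!!!!!!!!!\n', 'GOOD HUNGRY JACK GAME\n']
stemmed = stem_sentences(data)
print(stemmed)
```

stemmed[i] then holds the stems for data[i], so you can still tell which review each stem came from.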

Hope this helps.