So I was analyzing a text corpus and I used a stemmer on all the tokenized words.
But I also have to find all the nouns in the corpus, so I then ran nltk.pos_tag(stemmed_sentence) on the stemmed tokens.
But my question is: am I doing it right?
A.] tokenize->stem->pos_tagging
OR
B.] tokenize->stem #stemming and pos_tagging done separately
tokenize->pos_tagging
I've followed method A, but I'm not sure whether it's the right way to achieve pos_tagging.
Why don't you try it out?
Here's an example:
This is the outcome of tokenizing.
This is the outcome of tokenize -> stem
This is the outcome of tokenize -> stem -> POS tag
This is the outcome of tokenize -> POS tag
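The four stages listed above can be reproduced with a small helper. This is only a sketch: `compare_pipelines` and `compare_with_nltk` are names made up for illustration, and the tokenizer, stemmer and tagger are passed in as plain callables so the comparison itself does not depend on any particular library. The NLTK wrapper assumes `word_tokenize`, `PorterStemmer` and `pos_tag`, and it needs NLTK's tokenizer and tagger data to be downloaded first (the resource names passed to `nltk.download` vary between NLTK versions).

```python
def compare_pipelines(sentence, tokenize, stem, pos_tag):
    """Run the four stages on one sentence and return each result by name."""
    tokens = tokenize(sentence)
    stems = [stem(token) for token in tokens]
    return {
        "tokenize": tokens,
        "tokenize -> stem": stems,
        # Method A from the question: the tagger only sees stems.
        "tokenize -> stem -> POS tag": pos_tag(stems),
        # Method B from the question: the tagger sees the original words.
        "tokenize -> POS tag": pos_tag(tokens),
    }


def compare_with_nltk(sentence):
    """Same comparison using NLTK (assumes its tokenizer and tagger data are installed)."""
    # Imported here so compare_pipelines stays usable without NLTK.
    from nltk import pos_tag, word_tokenize
    from nltk.stem import PorterStemmer

    return compare_pipelines(sentence, word_tokenize, PorterStemmer().stem, pos_tag)
```

Calling `compare_with_nltk` on a sentence of your own and printing each entry of the returned dict puts the tags from both orderings side by side, which makes it easy to spot tokens whose tag changes once they have been stemmed.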
So what's the right way?
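One way to read the comparison: NLTK's default tagger is trained on ordinary, unstemmed text, so it is usually safer to tag the original tokens and stem afterwards if stems are still needed. A minimal sketch of that ordering follows. The function names are hypothetical, and it assumes the Penn Treebank tags that `nltk.pos_tag` returns by default, where every noun tag (NN, NNS, NNP, NNPS) starts with "NN".

```python
def stemmed_nouns(tagged_tokens, stem):
    """Stem only the tokens tagged as nouns (Penn Treebank tags starting with 'NN')."""
    return [stem(word) for word, tag in tagged_tokens if tag.startswith("NN")]


def stemmed_nouns_with_nltk(sentence):
    """Tag first, stem second (assumes NLTK's tokenizer and tagger data are installed)."""
    from nltk import pos_tag, word_tokenize
    from nltk.stem import PorterStemmer

    tagged = pos_tag(word_tokenize(sentence))  # the tagger sees unstemmed words
    return stemmed_nouns(tagged, PorterStemmer().stem)
```

Because `stemmed_nouns` takes the stemmer as an argument, a lemmatizer or any other normalizer can be swapped in without touching the tagging step.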