I was testing the Stanford NLP POS Tagger, and I am getting mixed results.
SOP(StanfordNLP.getInstance().getPOSMap("WHEAT flour(whole)".toLowerCase()));
SOP(StanfordNLP.getInstance().getPOSMap("Whole wheat flour".toLowerCase()));
This gives me the following output:
{NN=[wheat, flour, whole]}
{JJ=[whole], NN=[wheat, flour]}
How can I deal with problems like these? It's actually the same words, just rearranged.
EDIT
Maybe I should explain the problem.
I want to compare two sentences. My approach is to run POS tagging on both strings and then compare and score the nouns, adjectives, and verbs from each string separately.
But because the tagging is fuzzy and depends on word order (as @Elliott also pointed out), my ranking fails in some cases. Can someone suggest a workaround?
Are there classification statistics that give the probability of a noun being tagged as an adjective, verb, etc., which I could use as weights in my scoring algorithm?
Thanks, Chahat
POS taggers will always give mixed results like this. POS tagging is contextual: the same word can be a noun, an adjective, or a verb depending on where it appears. The tagger's model decides how to tag each word based on its position and its neighbors in the sentence, so rearranging the same words can change their tags.
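One possible workaround, sketched below, is to make the comparison tolerant of tag disagreements: give full credit when a word appears in both inputs under the same tag, and partial credit when it appears in both under different tags. This is only a minimal sketch under some assumptions. It assumes `getPOSMap` returns a `Map<String, List<String>>` like the output shown in the question; the class name `PosOverlap` and the 0.5 weight are hypothetical and chosen purely for illustration, not taken from any tagger statistics.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PosOverlap {
    // Hypothetical weight: credit for a word found in both inputs under different tags.
    static final double CROSS_TAG_WEIGHT = 0.5;

    // Returns a similarity in [0, 1] between two tag -> words maps.
    static double score(Map<String, List<String>> a, Map<String, List<String>> b) {
        Map<String, String> tagA = invert(a);
        Map<String, String> tagB = invert(b);

        Set<String> allWords = new HashSet<>(tagA.keySet());
        allWords.addAll(tagB.keySet());
        if (allWords.isEmpty()) {
            return 0.0;
        }

        double total = 0.0;
        for (String word : allWords) {
            String ta = tagA.get(word);
            String tb = tagB.get(word);
            if (ta == null || tb == null) {
                continue; // word is missing from one input: no credit
            }
            total += ta.equals(tb) ? 1.0 : CROSS_TAG_WEIGHT;
        }
        return total / allWords.size();
    }

    // Turns tag -> words into word -> tag. If a word occurs under several tags,
    // whichever tag is visited last wins (a simplification for this sketch).
    static Map<String, String> invert(Map<String, List<String>> posMap) {
        Map<String, String> wordToTag = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : posMap.entrySet()) {
            for (String word : entry.getValue()) {
                wordToTag.put(word, entry.getKey());
            }
        }
        return wordToTag;
    }

    public static void main(String[] args) {
        // The two outputs from the question, written out by hand.
        Map<String, List<String>> first =
                Map.of("NN", List.of("wheat", "flour", "whole"));
        Map<String, List<String>> second =
                Map.of("JJ", List.of("whole"), "NN", List.of("wheat", "flour"));
        System.out.println(score(first, second));
    }
}
```

With the two outputs from the question, "wheat" and "flour" match on both word and tag, while "whole" matches on the word only, so the score comes out at about 0.83 instead of dropping as it would with a strict per-tag comparison. The weight could be tuned per tag pair if you have data on how often the tagger confuses, say, NN and JJ.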