I am getting POS-tagged text in R in the following form:
id type start end features
1 word 1 5 POS=NNP
2 word 7 8 POS=IN
.....
I want to retrieve the actual words that were tagged, rather than a 'type' column whose value is always "word". I can use scan_tokenizer, but it breaks down on forms like "isn't": the POS tagger splits it into "is" and "not", which is great, but scan_tokenizer does not tokenize that way and leaves it as "isn't". Can anyone please help me retrieve the words that R tokenized and used for POS tagging?
Thanks
Why don't you use the Illinois POS tagger? It is easy to use and to visualize:
http://cogcomp.cs.illinois.edu/page/software_view/3
http://cogcomp.cs.illinois.edu/demo/pos/?id=4
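
If you would rather stay in R, the start and end columns in your output look like character offsets into the original text, so you may be able to cut the words out directly instead of re-tokenizing. Below is a minimal sketch in base R. The sentence, the offsets for the third word, and the data frame `ann` are made up for illustration (the question does not show the original text), and I am assuming the offsets are 1-based and inclusive, which matches the 1-5 and 7-8 spans shown above.

```r
# Hypothetical input text; the question does not show the original sentence.
text <- "Paris in spring"

# Hand-built stand-in for the tagger's annotation table (assumed values).
ann <- data.frame(
  id    = 1:3,
  type  = "word",
  start = c(1, 7, 10),
  end   = c(5, 8, 15),
  pos   = c("NNP", "IN", "NN"),
  stringsAsFactors = FALSE
)

# Assuming start/end are 1-based, inclusive character offsets into the text,
# substring() recovers the span each tag refers to (it is vectorized).
ann$word <- substring(text, ann$start, ann$end)

# Pair each recovered word with its tag.
tagged <- paste(ann$word, ann$pos, sep = "/")
print(tagged)
```

Because the words are cut from the tagger's own spans, a contraction such as "isn't" should come out split the same way the tagger split it, which scan_tokenizer cannot give you. If your annotations come from the NLP/openNLP packages, I believe indexing the `as.String()` version of the text with the annotation object (something like `s[a]`) does the same job, but please check that against the package documentation.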