Adapting StanfordCoreNLP to process noisy web text?

Question

Asked by Jess on 06 December 2013 at 02:43 (149 views)

I've been trying out the Stanford CoreNLP NER and the other annotators by hand on the demo website, and they seem to depend on cues from well-formed English (sentence structure, punctuation) to detect entities. Web text, though, often looks like this:

John Doe

Assistant Professor of Computer Science

Stanford University

Stanford CoreNLP has trouble with this: it labels the whole thing as a single organization, presumably because there are no prepositions or punctuation to separate the lines. Is there anything I can do to help the NER handle this kind of text better (e.g. some pre-processing of the text)?

There is 1 answer

**Vanaja Jayaraman** · Answer 1 · 2014-07-18T05:03:48+00:00

Adding a dot (.) at the end of each line gives better results, since the sentence splitter uses the dot as a delimiter. Without it, the three lines are treated as one sentence and tagged together.
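A minimal sketch of that pre-processing step in plain Java, in case it helps. The class and method names (`LinePunctuator`, `addTerminalPeriods`) are made up for illustration and are not part of CoreNLP. It assumes every non-empty line is its own unit of text, and that lines already ending in `.`, `!` or `?` should be left alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CoreNLP: ends every non-empty line
// with a period so a sentence splitter sees each line as its own sentence.
class LinePunctuator {

    static String addTerminalPeriods(String text) {
        List<String> out = new ArrayList<>();
        // \R matches any line break (\n, \r\n, ...); needs Java 8 or later
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines carry no content, so drop them
            }
            char last = trimmed.charAt(trimmed.length() - 1);
            boolean terminated = last == '.' || last == '!' || last == '?';
            out.add(terminated ? trimmed : trimmed + ".");
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String raw = "John Doe\n\nAssistant Professor of Computer Science\n\nStanford University";
        // The returned string is what you would hand to the CoreNLP pipeline
        // instead of the raw web text.
        System.out.println(addTerminalPeriods(raw));
    }
}
```

The output of `addTerminalPeriods` would then be passed to the pipeline in place of the raw text. CoreNLP's sentence splitter also documents an `ssplit.eolonly` property that splits on line ends only; I have not checked whether it helps with this particular input.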
