What are trained models in NLP?


I am new to natural language processing. Can anyone tell me what the trained models in OpenNLP or Stanford CoreNLP are? When coding in Java using the Apache OpenNLP package, we always have to include some trained models (found here: http://opennlp.sourceforge.net/models-1.5/ ). What are they?


There are 2 answers

errantlinguist (best answer):

A downloadable "model" for OpenNLP is a set of data representing probability distributions used for predicting the structure you want (e.g. part-of-speech tags) from the input you supply (in the case of OpenNLP, typically text files).
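Concretely, using one of those downloaded model files with OpenNLP looks roughly like the sketch below. It assumes the OpenNLP jar is on the classpath and that the POS model file (here assumed to be named `en-pos-maxent.bin`, as on the models page) has been saved locally; adjust the path as needed.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;

public class PosTagDemo {
    public static void main(String[] args) throws Exception {
        // Deserialize the pre-trained model file into a POSModel,
        // then wrap it in a maximum-entropy tagger.
        try (InputStream in = new FileInputStream("en-pos-maxent.bin")) {
            POSModel model = new POSModel(in);
            POSTaggerME tagger = new POSTaggerME(model);

            // The model predicts one tag per input token.
            String[] tokens = {"This", "answer", "is", "perfect"};
            String[] tags = tagger.tag(tokens);

            // probs() exposes the probabilities the model assigned
            // to the tags it just chose.
            double[] probs = tagger.probs();

            for (int i = 0; i < tokens.length; i++) {
                System.out.printf("%s/%s (%.3f)%n", tokens[i], tags[i], probs[i]);
            }
        }
    }
}
```

Note that the model file itself contains no code, only the learned parameters; the OpenNLP classes above supply the algorithm that reads those parameters and applies them to your input.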

Because natural language is context-sensitive, such a model is used in lieu of a rule-based system, since it generally works better for a number of reasons which I won't expound on here for brevity. For example, the token perfect could be either a verb (VB) or an adjective (JJ), and this can only be disambiguated in context:

  • This answer is perfect — for this example, the following sequences of POS tags are possible (in addition to many more):
    1. DT NN VBZ JJ
    2. DT NN VBZ VB

However, according to a model which accurately represents ("correct") English§, the probability of example 1 is greater than that of example 2: P([DT, NN, VBZ, JJ] | ["This", "answer", "is", "perfect"]) > P([DT, NN, VBZ, VB] | ["This", "answer", "is", "perfect"])
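The comparison above can be sketched as a toy program: pretend the trained model has already scored the two candidate tag sequences, and the tagger simply picks the more probable one. The probability values here are invented purely for illustration, not taken from any real model.

```java
import java.util.Map;

public class TagProbabilityDemo {
    public static void main(String[] args) {
        // Hypothetical probabilities a trained model might assign to two
        // candidate tag sequences for "This answer is perfect".
        // The numbers are made up for this example.
        Map<String, Double> p = Map.of(
            "DT NN VBZ JJ", 0.83,   // "perfect" read as an adjective
            "DT NN VBZ VB", 0.004   // "perfect" read as a verb
        );

        // A tagger's decision rule, reduced to its essence:
        // choose the tag sequence with the highest probability.
        String best = p.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .get().getKey();

        System.out.println(best);  // prints "DT NN VBZ JJ"
    }
}
```

A real tagger never enumerates all sequences explicitly (the number of candidates grows exponentially with sentence length); it uses dynamic programming or beam search over per-token scores, but the underlying idea is this argmax over probabilities.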


In reality, this is quite contentious, but I stress here that I'm talking about natural language as a whole (including semantics/pragmatics/etc.) and not just about natural-language syntax, which (in the case of English, at least) is considered by some to be context-free.

When analyzing language in a data-driven manner, in fact any combination of POS tags is "possible", but, given a sample of "correct" contemporary English with little noise, tag assignments which native speakers would judge to be "wrong" should have an extremely low probability of occurrence.

§In practice, this means a model trained on a large, diverse corpus of (contemporary) English (or whatever other target domain you want to analyze) with appropriate tuning parameters (if I wanted to be even more precise, this footnote could easily run to multiple paragraphs).

Ganesh Krishnan:

Think of a trained model as a "wise brain with existing information".

When you start out in machine learning, the brain for your model is clean and empty. You can either download a trained model, or you can train your own model (like teaching a child).

Usually you only train models for edge cases; otherwise you download trained models and get straight to work on prediction.