Where can I find a list of class labels for pretrained SparkNLP NerDLModel?


I have been searching for a while, but have had no luck finding out which NER labels are included in the pretrained NerDL (TensorFlow) model. I would expect the training data to provide this information, but I do not see it mentioned in any documentation.

Downloadable model: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/ner_precise_en_1.7.0_2_1539623388047.zip

Any direction would be appreciated!

UPDATE:

I filed an issue on the SparkNLP GitHub, following the advice here :), and I just heard back from them. Here is the answer:

For practical purposes, the pretrained NER model has the following labels:

B-ORG

I-ORG

B-PER

I-PER

B-LOC

I-LOC

and it was trained on: https://raw.githubusercontent.com/patverga/torch-ner-nlp-from-scratch/master/data/conll2003/eng.train

See original issue here.
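If you want to check the label set yourself, one option is to scan the training file and collect the distinct tags. Below is a minimal sketch in plain Python, assuming the usual CoNLL-2003 column layout (token, POS tag, chunk tag, NER tag, separated by whitespace, with blank lines between sentences). The function name `collect_ner_labels` and the small inline sample are my own illustration, not part of SparkNLP. Note that CoNLL-2003 files also carry `O` and MISC tags, so the result of such a scan can be a superset of the six labels quoted above.

```python
from collections import Counter


def collect_ner_labels(lines):
    """Count the NER tag (last whitespace-separated column) of every token line.

    Hypothetical helper for illustration; assumes CoNLL-2003 style input.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank sentence separators and document marker lines.
        if not line or line.startswith("-DOCSTART-"):
            continue
        counts[line.split()[-1]] += 1
    return counts


# Small hand-written sample in the assumed column layout (not the real file).
sample = """-DOCSTART- -X- O O

EU NNP I-NP I-ORG
rejects VBZ I-VP O
German JJ I-NP I-MISC
call NN I-NP O
. . O O

Peter NNP I-NP I-PER
Blackburn NNP I-NP I-PER
""".splitlines()

label_counts = collect_ner_labels(sample)

# Strip the B-/I- prefix to get the underlying entity types.
entity_types = sorted({tag.split("-", 1)[1] for tag in label_counts if tag != "O"})

print(sorted(label_counts))
print(entity_types)
```

To run this on the real data, pass an open file handle for a downloaded copy of `eng.train` instead of `sample`, since the function only needs an iterable of lines.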

Answered by AlbertoAndreotti

That model is trained on the CoNLL-2003 dataset for NER:

http://aclweb.org/anthology/W03-0419

That dataset basically has PERSON, ORGANIZATION, and LOCATION.

Hope this helps, Alberto.