I might be barking up the wrong tree: is it possible to use the Stanford NLP Named Entity Recognizer (Stanford NER) to parse names that come in wildly different formats?
I'm using the NuGet package in a C# application, as described at http://sergey-tihon.github.io/Stanford.NLP.NET/StanfordNER.html, with the english.all.3class.distsim.crf.ser.gz model.
The Name field that I need to parse might contain values in any of these formats:
- Joe Jones
- Joe Jones, Jane Jones
- Joe & Jane Jones
- Joe and Jane Jones
- Jones, Joe
- Jones, Joe and Jane
With the english.all.3class.distsim.crf.ser.gz model, Stanford NER handles the first two formats fine, correctly putting each name into its own PERSON node, but it does not handle the rest.
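To reproduce this, the sample values can be run through the classifier with something like the following. This is only a minimal sketch modeled on the C# sample from the linked page, not necessarily the exact code in use; the model path is a placeholder, and the `NerRepro` class name is made up for illustration.

```csharp
using System;
using edu.stanford.nlp.ie.crf;

class NerRepro
{
    static void Main()
    {
        // Placeholder path (assumption): point this at your local copy of the model file.
        var modelPath = @"classifiers\english.all.3class.distsim.crf.ser.gz";

        // Load the CRF model; this call returns the classifier instead of throwing checked exceptions.
        var classifier = CRFClassifier.getClassifierNoExceptions(modelPath);

        // The name formats listed in the question.
        string[] samples =
        {
            "Joe Jones",
            "Joe Jones, Jane Jones",
            "Joe & Jane Jones",
            "Joe and Jane Jones",
            "Jones, Joe",
            "Jones, Joe and Jane"
        };

        foreach (var sample in samples)
        {
            // Print each input next to its tagged form, with entities wrapped in inline XML tags.
            Console.WriteLine("{0} => {1}", sample, classifier.classifyWithInlineXML(sample));
        }
    }
}
```

Printing the inline XML for each value makes it easy to see which formats end up with separate PERSON tags and which do not.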
Is there a model file available that would do a more thorough job of parsing names in these formats?