I just started using OpenNLP to recognize names. I am using the model (en-ner-person.bin) that comes with OpenNLP. I noticed that while it recognizes US, UK, and European names, it fails to recognize Indian or Japanese names. My questions are: (1) Are there existing models that I can use to recognize foreign names? (2) If not, I believe I will need to generate new models. In that case, are there corpora available that I can use?
OpenNLP: foreign names do not get recognized
4.6k views Asked by Shirish Kumar At
You can build your own model from your own data using an OpenNLP addon called modelbuilder-addon. It is brand new, so if you try it you may be the first person other than me to do so, but it works for me.
You feed it the following:
You can check out the addon here:
https://svn.apache.org/repos/asf/opennlp/addons/modelbuilder-addon
You can use this to get started.
The idea is that your known entities (common names in your data) are used to create annotations, and those annotations are used to train a model. The model is then used to find more names and produce more annotations, and so on. The tool repeats this cycle as many times as the "iterations" parameter specifies. Run it and check your results; add any undesirable hits to the blacklist file, then run the training again. I've used this and got pretty good results. If you find problems with it, file a ticket with OpenNLP.
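To make the first step of that cycle concrete, here is a minimal sketch of turning known entities into annotations. It is not the addon's actual code: the class `SeedAnnotator` and its `annotate` method are hypothetical names made up for illustration, and the sketch assumes sentences are already tokenized on whitespace. The `<START:person> ... <END>` markup is the format OpenNLP's name finder training expects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of OpenNLP or modelbuilder-addon.
final class SeedAnnotator {

    // Wraps every known, non-blacklisted name found in a whitespace-tokenized
    // sentence in OpenNLP name finder training markup.
    static String annotate(String sentence, List<String> knownNames, Set<String> blacklist) {
        String[] tokens = sentence.trim().split("\\s+");
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < tokens.length) {
            int matched = matchLength(tokens, i, knownNames, blacklist);
            if (out.length() > 0) {
                out.append(' ');
            }
            if (matched > 0) {
                out.append("<START:person> ");
                out.append(String.join(" ", Arrays.copyOfRange(tokens, i, i + matched)));
                out.append(" <END>");
                i += matched;
            } else {
                out.append(tokens[i]);
                i++;
            }
        }
        return out.toString();
    }

    // Returns the token count of the longest known name that starts at
    // position 'start' and is not blacklisted, or 0 if none matches.
    private static int matchLength(String[] tokens, int start,
                                   List<String> knownNames, Set<String> blacklist) {
        int best = 0;
        for (String name : knownNames) {
            if (blacklist.contains(name)) {
                continue; // undesirable hit recorded after an earlier run
            }
            String[] nameTokens = name.trim().split("\\s+");
            if (nameTokens.length <= best || start + nameTokens.length > tokens.length) {
                continue;
            }
            boolean same = true;
            for (int k = 0; k < nameTokens.length; k++) {
                if (!tokens[start + k].equals(nameTokens[k])) {
                    same = false;
                    break;
                }
            }
            if (same) {
                best = nameTokens.length;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("Satyajit Ray", "Akira Kurosawa", "Tokyo");
        Set<String> blacklist = new HashSet<>(Arrays.asList("Tokyo"));
        System.out.println(annotate("Satyajit Ray met Akira Kurosawa in Tokyo .", known, blacklist));
    }
}
```

Sentences annotated this way, one per line, could then serve as training data for a name finder model. In the cycle described above, the trained model would propose further names, and any bad ones would go into the blacklist before the next run.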