I am using Tesseract in my Android application. I defined a "user-words" file and added the bold line so that the OCR engine takes the user-words file into account.
String language = "deu";
datapath = getFilesDir()+ "/tesseract/";
Tess = new TessBaseAPI();
checkFile(new File(datapath + "tessdata/"));
**Tess.setVariable("user_words_suffix","deu.user-words");**
Tess.init(datapath, language);
I did not define a user-patterns file, since there is no specific pattern in my images. I just copied the UTF-8 text file deu.user-words into the tessdata folder. Is this enough to configure the OCR? Or should I unpack deu.traineddata, add this file to it, and then repack it? If so, can you give me some hints on how to do that?
You don't need to specify the language prefix in the code:
Tess.setVariable("user_words_suffix", "user-words");
Make sure the file's prefix matches the specified language code, namely deu.user-words.
https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc
https://github.com/tesseract-ocr/tesseract/wiki/ControlParams
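To double-check the naming before calling init, it can help to compute the file name Tesseract is expected to look for. The sketch below is plain Java and assumes the lookup is tessdata/<language>.<user_words_suffix> under the data path, as described above. The class name UserWordsPath, the helper expectedUserWordsFile, and the example path are hypothetical and not part of the Tesseract API.

```java
import java.io.File;

public class UserWordsPath {

    // Hypothetical helper: builds the path where the user-words file is
    // assumed to live, i.e. <datapath>/tessdata/<language>.<suffix>.
    static File expectedUserWordsFile(String datapath, String language, String suffix) {
        File tessdataDir = new File(datapath, "tessdata");
        return new File(tessdataDir, language + "." + suffix);
    }

    public static void main(String[] args) {
        // Example values only; on Android the data path would come from
        // getFilesDir() + "/tesseract/" as in the question.
        String datapath = "/data/user/0/com.example/files/tesseract/";
        String language = "deu";
        String suffix = "user-words"; // value passed to user_words_suffix

        File userWords = expectedUserWordsFile(datapath, language, suffix);
        System.out.println("Expected user-words file: " + userWords.getPath());
        System.out.println("Exists: " + userWords.exists());
    }
}
```

If the file reported here does not exist, the suffix passed to setVariable and the file name in tessdata probably do not match, for example a file named due.user-words instead of deu.user-words.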