Tesseract OCR: less memory consumption by avoiding new instances of TessBaseAPI?


Basically, I create a new instance of MyOCR whenever I need to perform OCR. This is what my constructor currently looks like:

public MyOCR(Bitmap bitmap)
{
    this.tessBaseAPI = new TessBaseAPI();
    this.tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_ONLY);
    try {
        this.tessBaseAPI.setDebug(true);
        this.tessBaseAPI.init("storage/emulated/0", "eng");
        this.tessBaseAPI.setImage(bitmap);
        this.text = tessBaseAPI.getUTF8Text();
        this.tessBaseAPI.end();
    } catch (Exception e) {
        e.printStackTrace();
        System.err.println(e.getMessage());
    }
}

I was wondering whether, performance-wise, the following code would be preferable in the long run. Here I create only one instance of MyOCR and set a new image each time I need to perform OCR.

public MyOCR()
{
    this.tessBaseAPI = new TessBaseAPI();
    this.tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_ONLY);
    try {
        this.tessBaseAPI.setDebug(true);
        this.tessBaseAPI.init("storage/emulated/0", "eng");
    } catch (Exception e) {
        e.printStackTrace();
        System.err.println(e.getMessage());
    }
}

public void ocr(Bitmap bitmap)
{
    this.tessBaseAPI.setImage(bitmap);
    this.text = tessBaseAPI.getUTF8Text();
}
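
One thing the second version leaves open is when end() gets called, since the single instance now holds its native resources for as long as it lives. Below is a rough sketch of how the reuse pattern could be wrapped so that initialization happens once and the release is explicit. It is only an illustration under assumptions: OcrEngine is a hypothetical stand-in for the handful of TessBaseAPI calls used above (with a String in place of a Bitmap so it compiles without Android), FakeEngine and ReusableOcr are made-up names, and the clear() call between images assumes the wrapper library exposes such a method.

```java
// Hypothetical stand-in for the subset of TessBaseAPI used in the question,
// so the sketch compiles without the Android library.
interface OcrEngine {
    boolean init(String dataPath, String language);
    void setImage(String image);   // the real API takes a Bitmap
    String getUTF8Text();
    void clear();                  // assumed: drops the image and results, keeps language data
    void end();                    // releases everything, including language data
}

// Wrapper that initializes the engine once and reuses it for every image.
final class ReusableOcr implements AutoCloseable {
    private final OcrEngine engine;
    private boolean closed = false;

    ReusableOcr(OcrEngine engine, String dataPath, String language) {
        this.engine = engine;
        // Pay the initialization cost a single time, up front.
        if (!engine.init(dataPath, language)) {
            throw new IllegalStateException("OCR engine failed to initialize");
        }
    }

    String ocr(String image) {
        if (closed) {
            throw new IllegalStateException("OCR engine already released");
        }
        engine.setImage(image);
        try {
            return engine.getUTF8Text();
        } finally {
            // Free per-image data so it does not linger between calls.
            engine.clear();
        }
    }

    @Override
    public void close() {
        // Release the engine exactly once, even if close() is called again.
        if (!closed) {
            engine.end();
            closed = true;
        }
    }
}

// Fake engine used only to exercise the wrapper in this sketch.
final class FakeEngine implements OcrEngine {
    int initCalls = 0;
    int endCalls = 0;
    private String current = "";

    public boolean init(String dataPath, String language) { initCalls++; return true; }
    public void setImage(String image) { current = image; }
    public String getUTF8Text() { return "text:" + current; }
    public void clear() { current = ""; }
    public void end() { endCalls++; }
}

class ReuseDemo {
    public static void main(String[] args) {
        FakeEngine fake = new FakeEngine();
        // try-with-resources guarantees end() runs when the work is done.
        try (ReusableOcr ocr = new ReusableOcr(fake, "storage/emulated/0", "eng")) {
            System.out.println(ocr.ocr("first"));
            System.out.println(ocr.ocr("second"));
        }
        System.out.println("init calls: " + fake.initCalls + ", end calls: " + fake.endCalls);
    }
}
```

In an actual app the wrapper would hold the real TessBaseAPI, and close() would be called from wherever the owning component is torn down. Whether this really saves memory or time compared with the first version is something I would still have to measure.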