How can I improve text recognition accuracy with jTessBoxEditor?

Question

Asked by Jeff on 31 August 2020 at 06:26

I have been trying to extract data from scanned PDF documents. I converted the PDF to a JPEG (image linked below), cropped out words and numbers in the different fonts, merged the crops into a TIFF file, and trained on it with jTessBoxEditor to generate a new language. I then used that language in Tesseract-OCR to extract the data from the file, but the extracted text does not match the document. The recognition accuracy is very poor.

Can someone suggest a method to improve the accuracy?

Image from which I have been trying to extract the data

There is 1 answer

**Ahx** · Answer 1 · 2020-08-31T09:51:21+00:00

> I couldn't extract the exact data

Did you use the image_to_string method?

text = pytesseract.image_to_string(image)
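For completeness, here is a minimal sketch of how that call might be wrapped so it runs end to end. It assumes Pillow and pytesseract are installed and the Tesseract binary is on the PATH. The helper names `tesseract_config` and `ocr_page`, the file name `page.jpg`, and the choice of `--psm 6` (treat the page as a single uniform block of text) are illustrative assumptions, not something taken from the question.

```python
def tesseract_config(psm=6, oem=3):
    """Build the option string handed to Tesseract via pytesseract's `config` argument.

    Hypothetical helper: psm=6 assumes one uniform block of text,
    oem=3 lets Tesseract pick its default engine.
    """
    return f"--oem {oem} --psm {psm}"


def ocr_page(image_path, lang="eng", psm=6):
    """Run Tesseract on one page image and return the recognized text."""
    # Imported lazily so tesseract_config() works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=tesseract_config(psm)
        )


# Example call (hypothetical file name):
# text = ocr_page("page.jpg")
```

Passing `lang="eng"` uses the stock English model. A language trained with jTessBoxEditor could be selected by passing its name instead, provided its traineddata file is in Tesseract's tessdata directory.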

Part of the output:

Prepared By: Pat Schelbel Inst No:95018369 Date:12/27/1995
Post Office Address: + Barnett Bank of Volusia County Doc Stamp—Mort : 525.00
Intang. Tax : 320. 0G
SYD CROSBY, FLAGLER County
LOANNO, — 288155 By: 1, Dtpaneaad D.C. Time:16243:
[Space Above This Line For Recording Data]
MORTGAGE
THIS MORTGAGE (Seeuri Instrument’) Is givenon DECEMBER 20, 1995 . The mortgagoris
JOYCE M. PETERSON, ASTRUSTEE AND INDIVIDUALLY AND ROGER E. PETERSON, HER HUSBAND
of the Trust Agreement dated January 25, 1995.
{"Borrower").
This Security Instrument is given to Barnett Bank of Volusia County , care of Barnett
Mortgage Company, P.O. Box 40843, Jacksonville, Florida 32203-0843, which Is organized and existing under the laws of
the State of Florida ,and whose address is. 1825 Business Park Blvd, Daytona Beach, FL 32114
Borrower owes Lender the principal sum of One Hundred Fifty Thousand Dollars and no/100
Dollars (US.$ 150,000.00 ). This debtis
evidenced by Borrower's note dated the same date as this Security Instrument ("Note"), which provides for monthly payments, with the full debt, if
not paid earlier, due and payable on January 1, 2026 . This Security Instrument secures to Lender: (a) the repayment of the dabt
evidenced by the Note, with interest, and all renewals, extensions and modifications of the Note; (b) the payment of all other sums, with interest,
advanced under paragraph 7 to protect the security of this Security instrument; and (c) the performance of Borrower's covenants and agreements
under this Security Instrument and the Note. For this purpose, Borrower does hereby mortgage, grant and convey to Lender the following
descrived property locatedin Flagler County, Florida:
.

Update: the Inst No is extracted using pytesseract.
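Once the text is available, a field such as the Inst No can be pulled out with a regular expression. The following is only a sketch: the function name `extract_inst_no` and the pattern are assumptions based on the "Inst No:95018369" line in the output shown above, and other documents may need a different pattern.

```python
import re

# Matches "Inst No:95018369" and tolerates optional dots, spaces, and a missing colon.
INST_NO_PATTERN = re.compile(r"Inst\.?\s*No\.?\s*:?\s*(\d+)", re.IGNORECASE)


def extract_inst_no(text):
    """Return the instrument number found in OCR text, or None if absent."""
    match = INST_NO_PATTERN.search(text)
    return match.group(1) if match else None


# Line taken from the OCR output above
line = "Prepared By: Pat Schelbel Inst No:95018369 Date:12/27/1995"
inst_no = extract_inst_no(line)
```

Returning `None` instead of raising makes it easy to flag pages where recognition of that field failed.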
