Arabic number recognition


I am trying to detect Arabic (Arabic-Indic) numerals in an image.

I tried Tesseract OCR, but it did not work for me: it recognizes Arabic words but not the numerals. Here is the image; I would like to extract the page number from it (at the top of the page).

enter image description here

I also tried ImageMagick, comparing that image against small pre-made images covering every possible page number in the book, but that did not work either, and I think it would take too much time.

What would a practical, non-complex solution be? PS: the pictures will come from Android phones and will be parsed on a Windows or Linux server.


There is 1 answer

myex

Actually, Tesseract out of the box is not a viable solution to your problem, and neither is any commercial Arabic OCR. You need a custom OCR solution that you can train on your own samples and for which you can specify your own special handling rules.

You can still use Tesseract, but in the form of its source code and training tools, to build a custom solution yourself. To customize Tesseract for Arabic, you may find this link helpful: http://arabicocr.wordpress.com
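As a rough illustration of what one such "special handling rule" might look like, here is a minimal Python sketch of a post-processing step. It assumes the OCR stage (custom-trained or otherwise) has already returned a text string for the top of the page; the function name `extract_page_number` is hypothetical and is not part of Tesseract or any OCR library. It only normalizes Arabic-Indic digits (U+0660 to U+0669) and Eastern Arabic-Indic digits (U+06F0 to U+06F9) to ASCII and pulls out the first run of digits.

```python
import re

# Build the digit tables from code points to avoid any editor/encoding issues.
ARABIC_INDIC = "".join(chr(0x0660 + i) for i in range(10))   # U+0660..U+0669
EASTERN_INDIC = "".join(chr(0x06F0 + i) for i in range(10))  # U+06F0..U+06F9

# Translation table mapping both digit sets onto ASCII 0-9.
TO_ASCII = str.maketrans(ARABIC_INDIC + EASTERN_INDIC, "0123456789" * 2)


def extract_page_number(ocr_text):
    """Return the first run of digits in ocr_text as an int, or None.

    Hypothetical helper: ocr_text is assumed to be whatever string the
    OCR step produced for the page-number region.
    """
    normalized = ocr_text.translate(TO_ASCII)
    match = re.search(r"[0-9]+", normalized)
    return int(match.group()) if match else None


if __name__ == "__main__":
    sample = "\u0635\u0641\u062d\u0629 \u0664\u0665"  # Arabic word plus two digits
    print(extract_page_number(sample))
```

A rule like this does not make the recognition itself any better; it only makes the output easier to validate, for example by rejecting results that contain no digits at all.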