I am working on an OCR project whose goal is to read the stamped-in serial number off of a metal plate:
I am using OpenCV to prepare the image for OCR, and using Tesseract for the OCR itself. This is the ideal process:
- In a picture of the entire plate, crop to the general location of the serial number.
- Prepare the cropped image for OCR.
- Apply OCR.
My current process is this:
- Manually crop to serial number.
- Convert to grayscale.
- Sharpen.
- Use Canny edge detection.
- Run Tesseract OCR.
However, I am having very limited success. My main questions are:
- What sort of processing optimizes OCR? Is doing edge detection a good start?
- Can I perhaps use the stamped text's font to my advantage?
- Can I use the "color" of the text (as opposed to the gray of the metal or the black/white of the labels) to my advantage?
This may not be a complete solution, but it might help.
I have been working on a similar scenario where I wanted to extract text from embossed metal.
My approach is similar to yours:
What I have noticed is that Tesseract works better when the text is black and the background is white (which is why I do the 7th step).
You can see the code and results of my work here - https://github.com/DevashishPrasad/Embossed-Text-Reader
I would also like to mention that the result depends heavily on your image and on the Canny thresholds. Lower threshold values find more edges and higher values find fewer. More edges introduce noise into the image, while too few edges can fail to capture a whole digit, so the thresholds have to be tuned for your image.