How to draw a specific region in an image with the mouse and extract its text with Pytesseract or EasyOCR?


For example, I have an image like the one below:

enter image description here

I want to draw a box with the mouse around recognize_text(), which is in the first line, and extract that text with Pytesseract or EasyOCR in Python. Can anyone help me with this?

There is Python code on GitHub that lets you draw a single rectangle; the link and a screenshot are attached below. However, I am unable to extract the text inside that rectangle.

https://github.com/arccoder/opencvdragrect

enter image description here


There is 1 answer

Answer by Ahx:

You can use pytesseract's image_to_data function, which returns the bounding box of every text region it detects:

import cv2
import pytesseract
from pytesseract import Output


# Load the image
thr = cv2.imread("KKLoQ.jpg")
d = pytesseract.image_to_data(thr, config="--psm 6", output_type=Output.DICT)
n_boxes = len(d['level'])

The n_boxes variable holds the number of entries (detected regions) that image_to_data returned for the input image.
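To make the structure of that dictionary concrete, here is a minimal sketch. It assumes the usual pytesseract keys ('level', 'left', 'top', 'width', 'height', 'conf', 'text'); the helper name word_boxes and the sample values are made up for illustration and are not taken from the image in the question. Entries that describe pages, blocks or lines have empty text and a confidence of -1, so they are skipped here.

```python
def word_boxes(data, min_conf=0.0):
    """Return (text, (x, y, w, h)) for every entry that holds an actual word."""
    boxes = []
    for i in range(len(data["level"])):
        text = data["text"][i].strip()
        conf = float(data["conf"][i])  # conf may arrive as str, int or float
        if text and conf >= min_conf:
            box = (data["left"][i], data["top"][i],
                   data["width"][i], data["height"][i])
            boxes.append((text, box))
    return boxes


# Made-up sample in the shape of image_to_data(..., output_type=Output.DICT)
sample = {
    "level": [1, 4, 5, 5],
    "left": [0, 10, 10, 120],
    "top": [0, 12, 12, 12],
    "width": [400, 190, 100, 80],
    "height": [200, 20, 20, 20],
    "conf": ["-1", "-1", "96", "91"],
    "text": ["", "", "recognize", "text()"],
}

for text, box in word_boxes(sample):
    print(text, box)
```

With real data you would pass d from the snippet above instead of sample.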

Now, for each localized region, we need to get its coordinates, outline it with cv2.rectangle, and run OCR on the cropped region:

# Keep an untouched copy so the red outlines never end up in the OCR input
src = thr.copy()

for c, i in enumerate(range(n_boxes)):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # Draw rectangle to the detected region (on the display image only)
    cv2.rectangle(thr, (x, y), (x+w, y+h), (0, 0, 255), 1)

    # Crop the region from the untouched copy
    crp = src[y:y+h, x:x+w]

    # OCR
    txt = pytesseract.image_to_string(crp, config="--psm 6")
    print(txt)

    # Display the cropped image
    cv2.imshow("crp", crp)
    cv2.waitKey(0)

    # Debug purpose
    cv2.imwrite(f"/Desktop/cropped{c}.png", crp)

Since you only want the box (rectangle) around the text "recognize_text", you can filter on the OCR result inside the loop:

if "recognize " in txt:
    print(txt)

    # Display the cropped image
    cv2.imshow("crp", crp)
    cv2.waitKey(0)

    # Debug purpose
    cv2.imwrite(f"/Users/Desktop/cropped{c}.png", crp)

Results will be:

1 enter image description here
2 enter image description here

Most probably the second result is what you wanted.

Maybe you could also add another constraint:

  • If the recognized text contains exactly two words, draw the box, since "recognize text()" is two words.
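As a rough sketch of that constraint, the check could be wrapped in a small helper. The name is_target and its parameters are hypothetical, and the sample strings are made up. Calling split() without an argument splits on any run of whitespace, so a trailing newline or form feed in the OCR output does not change the word count.

```python
def is_target(txt, keyword="recognize", n_words=2):
    """True if the OCR text contains `keyword` and has exactly `n_words` words."""
    words = txt.split()  # no argument: split on any whitespace run
    return keyword in txt and len(words) == n_words


# Made-up OCR outputs: a two-word match and a longer line that should be rejected
print(is_target("recognize text()\n\x0c"))
print(is_target("def recognize text(img path):\n"))
```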

You should also read the Tesseract documentation on improving the quality of the output.

Code:

import cv2
import pytesseract
from pytesseract import Output


# Load the image
thr = cv2.imread("KKLoQ.jpg")

# OCR
d = pytesseract.image_to_data(thr, config="--psm 6", output_type=Output.DICT)
n_boxes = len(d['level'])

for c, i in enumerate(range(n_boxes)):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # Crop the image
    crp = thr[y:y+h, x:x+w]

    # OCR
    txt = pytesseract.image_to_string(crp, config="--psm 6")

    # Split on any whitespace so a trailing newline does not affect the count
    words = txt.split()

    if "recognize" in txt and len(words) == 2:
        print(txt)

        # Draw rectangle to the detected region
        cv2.rectangle(thr, (x, y), (x+w, y+h), (0, 0, 255), 3)

        # Debug purpose
        cv2.imwrite(f"/Users/Desktop/cropped{c}.png", crp)

cv2.imshow("result", thr)
cv2.waitKey(0)

Result:

enter image description here
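If you really want to drag the box with the mouse, as the question asks, one possible approach is OpenCV's built-in cv2.selectROI instead of the keyword filter. The following is only a sketch under that assumption, not the method of the linked GitHub repository. The helper names roi_to_slice and select_and_ocr are made up, and "--psm 7" (treat the image as a single text line) is a guess that suits a one-line selection.

```python
def roi_to_slice(roi, img_w, img_h):
    """Clamp an (x, y, w, h) selection to the image; return (x0, y0, x1, y1) or None."""
    x, y, w, h = (int(v) for v in roi)
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + w), min(img_h, y + h)
    if x1 <= x0 or y1 <= y0:
        return None  # selectROI returns (0, 0, 0, 0) when the selection is cancelled
    return x0, y0, x1, y1


def select_and_ocr(path):
    """Let the user drag a rectangle on the image and return the OCR text inside it."""
    # Imported here so roi_to_slice can be used without OpenCV/Tesseract installed
    import cv2
    import pytesseract

    img = cv2.imread(path)
    if img is None:
        raise FileNotFoundError(path)

    # Drag a rectangle with the mouse, then press ENTER or SPACE to confirm
    roi = cv2.selectROI("Select text region", img, showCrosshair=False)
    cv2.destroyAllWindows()

    box = roi_to_slice(roi, img.shape[1], img.shape[0])
    if box is None:
        return ""
    x0, y0, x1, y1 = box
    crop = img[y0:y1, x0:x1]

    # --psm 7 treats the crop as a single text line; use --psm 6 for a block
    return pytesseract.image_to_string(crop, config="--psm 7").strip()


# The clamping helper can be checked without a GUI:
print(roi_to_slice((90, 90, 30, 30), 100, 100))

# Interactive usage (opens a window):
# print(select_and_ocr("KKLoQ.jpg"))
```

With EasyOCR, the same crop could be passed to easyocr.Reader(['en']).readtext(crop, detail=0) instead of pytesseract.image_to_string.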