Error setting psm for pytesseract


I'm trying to use a psm of 0 with pytesseract, but I'm getting an error. My code is:

import pytesseract
from PIL import Image
img = Image.open('pathToImage')
pytesseract.image_to_string(img, config='-psm 0')

The error that comes up is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 126, in image_to_string
    f = open(output_file_name, 'rb')
IOError: [Errno 2] No such file or directory: '/var/folders/m8/pkg0ppx11m19hwn71cft06jw0000gp/T/tess_uIaw2D.txt'

When I go into '/var/folders/m8/pkg0ppx11m19hwn71cft06jw0000gp/T', there's a file called tess_uIaw2D.osd that seems to contain the output information I was looking for. It seems like tesseract is saving the file as .osd, and pytesseract is then looking for that file with a .txt extension. When I run tesseract through the command line with --psm 0, it also saves the output file as .osd instead of .txt.

Is it correct that pytesseract's image_to_string() works by saving an output file somewhere and then automatically reading that output file? And is there any way to either set tesseract to save the file as .txt, or to set it to look for a .osd file? I'm having no issues just running the image_to_string() function when I don't set the psm.

Answer from Stephen Gemin:

You have a couple of questions here:

  1. PSM error

    • In your question you mention running "--psm 0" on the command line, but your code snippet uses "-psm 0".
    • Using the double dash, config="--psm 0", makes the Python call pass the same flag as your command-line run.
  2. The tesseract command-line documentation explains how to specify where the text read from the image is written (the output base name argument). I suggest you start there.

  3. Is it correct that pytesseract's image_to_string() works by saving an output file somewhere and then automatically reading that output file?

    • Internally it does: the traceback in your question shows pytesseract opening a temporary output file after tesseract has run. As a caller, though, you normally never touch that file.
    • By default, pytesseract.image_to_string() returns the string found in the image. This is set by the parameter output_type=Output.STRING in the signature of image_to_string.
    • The other return options are (1) Output.BYTES and (2) Output.DICT.
    • I usually have something like text = pytesseract.image_to_string(img),
    • and then write that text to a log file.
    • Here is an example:
import datetime
import io
import pytesseract
import cv2

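# cv2.imread returns a NumPy array, which pytesseract accepts directly
img = cv2.imread("pathToImage")
# psm 0 only detects orientation and script, and tesseract writes a .osd file
# instead of .txt, so use image_to_osd() (it passes --psm 0 itself)
text = pytesseract.image_to_osd(img)
ocr_log = "C:/foo/bar/output.txt"
timestamp_fmt = "%Y-%m-%d_%H-%M-%S-%f"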

# ...
# DO SOME OTHER STUFF BEFORE WRITING TO LOG FILE
# ...

# Append the result to the log file with a timestamp and start/end markers
with io.open(ocr_log, "a") as ocr_file:
    timestamp = datetime.datetime.now().strftime(timestamp_fmt)
    ocr_file.write(f"{timestamp}:\n====OCR-START===\n")
    ocr_file.write(text)
    ocr_file.write("\n====OCR-END====\n")
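
To make the .osd versus .txt mismatch from the question concrete, here is a minimal sketch of the "run tesseract, then read back its output file" pattern. It is not pytesseract's actual code. The helper names output_path, parse_osd and run_tesseract are made up for illustration, the "--psm" spelling assumes a tesseract build that accepts the double-dash form (as your command-line run does), and the "Key: value" layout of the .osd file is an assumption about what that file contains.

```python
import os
import subprocess
import tempfile


def output_path(output_base, psm):
    """Return the file tesseract is expected to write for a given psm.

    Assumption: psm 0 (orientation and script detection only) produces
    '<base>.osd', as observed in the question; other modes produce
    '<base>.txt'.
    """
    extension = "osd" if psm == 0 else "txt"
    return f"{output_base}.{extension}"


def parse_osd(osd_text):
    """Turn 'Key: value' lines into a dict (assumed .osd layout)."""
    result = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            result[key.strip()] = value.strip()
    return result


def run_tesseract(image_path, psm):
    """Hypothetical helper: call the tesseract CLI and read its output file."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_base = os.path.join(tmp_dir, "tess_out")
        # tesseract <image> <output base> --psm <n>
        subprocess.run(
            ["tesseract", image_path, output_base, "--psm", str(psm)],
            check=True,
        )
        # Read the file with the extension that matches the chosen psm
        with open(output_path(output_base, psm), encoding="utf-8") as result_file:
            return result_file.read()
```

With tesseract installed, something like parse_osd(run_tesseract("pathToImage", psm=0)) would give you the OSD fields as a dictionary, while any other psm returns the recognized text from the .txt file.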