pytesseract on windows 10 : Error opening data file

2.5k views Asked by At

I am using pytesseract on windows 10 x64, and python is 3.5.2 x64,Tesseract is 4.0,the code is as follow:

# -*- coding: utf-8 -*-

try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract


print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))

error:

Traceback (most recent call last):
  File "D:/test.py", line 10, in <module>
    print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 165, in image_to_string
    raise TesseractError(status, errors)
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\tessdata/chi_sim.traineddata')

C:\Program Files (x86)\Tesseract-OCR\tessdata,like this:

enter image description here

why is it?

2

There are 2 answers

1
dubinglin On

If you have tessdata error like: “Error opening data file…”

tessdata_dir_config = '--tessdata-dir "<replace_with_your_tessdata_dir_path>"'
# Example config: '--tessdata-dir "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"'
# It's important to add double quotes around the dir path.

pytesseract.image_to_string(image, lang='chi_sim', config=tessdata_dir_config)
0
Nour Wolf On

Set TESSDATA_PREFIX environment variable to C:\Program Files (x86)\Tesseract-OCR\