I would like to import ocrmypdf.
I installed the package using pip install --upgrade --user ocrmypdf
but when I tried to import it in VS Code with:
import ocrmypdf
it raised this error:
[WinError 2] The system cannot find the file specified
[WinError 2] The system cannot find the file specified
---------------------------------------------------------------------------
MissingDependencyError Traceback (most recent call last)
<ipython-input-9-a81f3474d7ad> in <module>
----> 1 import ocrmypdf
~\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\__init__.py in <module>
8 from pluggy import HookimplMarker as _HookimplMarker
9
---> 10 from ocrmypdf import helpers, hocrtransform, leptonica, pdfa, pdfinfo
11 from ocrmypdf._concurrent import Executor
12 from ocrmypdf._jobcontext import PageContext, PdfContext
~\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\leptonica.py in <module>
42 _libpath = find_library(libname)
43 if not _libpath:
---> 44 raise MissingDependencyError(
45 """
46 ---------------------------------------------------------------------
MissingDependencyError:
---------------------------------------------------------------------
This error normally occurs when ocrmypdf can't find the Leptonica
library, which is usually installed with Tesseract OCR. It could be that
Tesseract is not installed properly, we can't find the installation
on your system PATH environment variable.
The library we are looking for is usually called:
liblept-5.dll (Windows)
liblept*.dylib (macOS)
liblept*.so (Linux/BSD)
Please review our installation procedures to find a solution:
https://ocrmypdf.readthedocs.io/en/latest/installation.html
---------------------------------------------------------------------
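The traceback ends in a MissingDependencyError, which means that an external dependency of ocrmypdf could not be found. In this case it is the Leptonica library, which is usually installed together with Tesseract OCR. pip installs only the Python package, not Tesseract itself, so most likely Tesseract is either not installed or not on your PATH environment variable. Install Tesseract, make sure its installation folder is on PATH, and then try the import again. The ocrmypdf documentation also lists Tesseract as a requirement for the module to work.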
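To see whether Python can locate these dependencies before importing ocrmypdf, something like the following rough check may help. It is only a sketch: the function name check_ocr_dependencies is made up for illustration, and the library base names are assumptions taken from the file names listed in the error message (liblept-5.dll on Windows, liblept* elsewhere). It uses only the standard library.

```python
import shutil
import sys
from ctypes.util import find_library


def check_ocr_dependencies():
    """Report where (or whether) Tesseract and Leptonica can be found.

    Hypothetical helper, not part of ocrmypdf. Returns a dict mapping a
    dependency name to its location, or None if it was not found.
    """
    # Assumption: base names follow the file names shown in the error
    # message (liblept-5.dll on Windows, liblept*.so / .dylib elsewhere).
    lept_name = "liblept-5" if sys.platform == "win32" else "lept"
    return {
        # shutil.which searches the PATH for the tesseract executable.
        "tesseract": shutil.which("tesseract"),
        # find_library is the same lookup that appears in the traceback.
        "leptonica": find_library(lept_name),
    }


if __name__ == "__main__":
    for name, location in check_ocr_dependencies().items():
        print(f"{name}: {location or 'not found'}")
```

If either entry prints "not found", the import is likely to keep failing until Tesseract is installed and its folder is added to PATH. Remember to restart the editor or Jupyter kernel after changing PATH, since a running process keeps the environment it started with.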