I have built a small utility using Python 3.8. Among other things it extracts some data from XML files using beautifulsoup4 and lxml. I use PyCharm and virtualenv for development and my utility works just fine.
In order to distribute the util to others I have a build script that copies my code to a dist directory and installs all dependencies into that directory using pip install -r requirements.txt -t dist. This also works fine, and I can run the code in the dist directory from my system interpreter (3.8, no beautifulsoup, no lxml). So the dependencies can apparently be loaded from dist.
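A minimal sketch of such a build step, written in Python for illustration. The function names and the idea of a single source directory are assumptions, not the actual build script, and it calls pip through sys.executable -m pip rather than a bare pip command so that pip belongs to the interpreter running the build:

```python
import shutil
import subprocess
import sys
from pathlib import Path


def pip_install_command(requirements: Path, target: Path) -> list:
    """Build the pip command that installs all requirements into target."""
    return [
        sys.executable, "-m", "pip", "install",
        "-r", str(requirements),
        "-t", str(target),
    ]


def build(src: Path, requirements: Path, dist: Path) -> None:
    """Copy the sources into dist and install the dependencies next to them."""
    # Start from a clean dist directory so stale packages do not linger
    if dist.exists():
        shutil.rmtree(dist)
    # Hypothetical layout: all application code lives under src
    shutil.copytree(src, dist)
    # check=True raises CalledProcessError if pip fails
    subprocess.run(pip_install_command(requirements, dist), check=True)
```

Note that pip install -t puts the packages built for the interpreter and platform that run the build into dist, which matters for packages that ship compiled code.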
It doesn't work on other machines, though. The script produces the error message
Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
This means that beautifulsoup4 can't find lxml (the same happens with "lxml-xml" or "xml"). The dependencies in the dist dir appear to be correct, though; nothing seems to be missing. I get the same error when I package the script as a zip app using python -m zipapp -p "python" dist, which yields a file dist.pyz. It can be executed but runs into the same error message, even on my own machine.
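The zipapp command above has a programmatic equivalent in the standard library, which may be easier to call from a build script. A small sketch, where make_zip_app is a hypothetical helper name:

```python
import zipapp
from pathlib import Path


def make_zip_app(dist: Path, target: Path) -> Path:
    """Package dist as a zip app, like: python -m zipapp -p "python" dist."""
    # dist must contain a __main__.py, otherwise create_archive raises ZipAppError
    zipapp.create_archive(dist, target, interpreter="python")
    return target
```

Calling make_zip_app(Path("dist"), Path("dist.pyz")) should produce the same kind of archive as the command line invocation.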
This is my requirements.txt file:
beautifulsoup4
jinja2
lxml
And this is the instantiation of the BeautifulSoup parser:
soup = BeautifulSoup(xml_data, features='lxml')
xml_data is just a string containing some valid XML that is read from a file generated by another tool.
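To separate a BeautifulSoup problem from an lxml problem, it may help to check on a failing machine whether the modules can be imported at all. The sketch below uses only the standard library; can_import is a hypothetical helper, and it relies on the assumption that the "lxml" parser feature needs the compiled lxml.etree module:

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if module_name can be imported by the current interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is covered too
        return False
    return True


if __name__ == "__main__":
    # Run this from the dist directory (or inside the zip app) on a failing machine
    for name in ("bs4", "lxml", "lxml.etree"):
        print(name, "importable:", can_import(name))
```

If bs4 and lxml import but lxml.etree does not, the tree builder error is probably a symptom of the failed lxml.etree import rather than a BeautifulSoup issue.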
I am out of ideas. I have lots of experience with .NET and Java but am not the greatest Python coder on the planet. It seems that I have entered the Python version of dependency hell... I really don't want to have users of the scripts invoke pip install lxml on their machines. I want to distribute a self-contained app with all dependencies.
Any help is appreciated.
Update
The order of the entries in requirements.txt makes no difference (I had hoped it would).
I added
from lxml.builder import ElementMaker
...
e = ElementMaker()
to the main script in order to import lxml into my script. This yields the error
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "dist.pyz\__main__.py", line 4, in <module>
  File "<frozen zipimport>", line 259, in load_module
  File "dist.pyz\lrg.py", line 3, in <module>
  File "<frozen zipimport>", line 259, in load_module
  File "dist.pyz\lxml\builder.py", line 44, in <module>
ModuleNotFoundError: No module named 'lxml.etree'
when run as a zip app, but it works fine from my IDE, which uses a virtualenv.
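One thing that could be checked here: lxml.etree is a compiled extension module, and Python's zip import mechanism cannot load compiled extensions from inside an archive. The sketch below lists such files in a directory like dist; find_extension_modules is a hypothetical helper, and it only recognises the suffixes of the interpreter that runs it, so binaries built for another platform or Python version would not show up:

```python
import importlib.machinery
from pathlib import Path


def find_extension_modules(root: Path) -> list:
    """Return files under root whose suffix marks a compiled extension module."""
    # For example ".pyd" on Windows or ".so" on Linux, plus version-tagged variants
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name.endswith(suffixes)
    )


if __name__ == "__main__":
    for path in find_extension_modules(Path("dist")):
        print(path)
```

Any file this prints is a module that would presumably work when dist is used as a plain directory but not when the same content is run from dist.pyz.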