GAE deploy error: No module named 'tabula'


I created a new project with a Python runtime and used Flask to expose some API endpoints. One of the methods uses a Python library (tabula-py), and I've read here that because tabula-py requires Java 8+, I have to use the flexible environment with a custom runtime.

So I did: I wrote a Dockerfile (shown below), but I still get this error while deploying the app to gcloud. Locally the code works perfectly; the error only appears when I run "gcloud app deploy".

Error: While importing "main", an ImportError was raised:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/flask/cli.py", line 240, in locate_app
    __import__(module_name)
  File "/main.py", line 4, in <module>
    import tabula
ModuleNotFoundError: No module named 'tabula'

main.py

import tabula
.
.
.
df = tabula.read_pdf(str(latest_file), pages=1)  ## transforming into list of dataframes.

app.yaml

runtime: custom
env: flex
env_variables:
 FLASK_APP : 'main.py'

Dockerfile

FROM python:3
RUN pip uninstall tabula && \
    pip install --upgrade pip && \
    pip install --no-cache-dir Flask pyvirtualdisplay python-environ Datetime && \
    pip install --no-cache-dir glob3 pandas-gbq pandas schedule && \
    pip install --no-cache-dir tabula-py beautifulsoup4 Datetime urllib3 && \
    pip install --no-cache-dir gunicorn Werkzeug && \
    pip install --upgrade pip --user && \
    pip3 uninstall -y tabula-py && \
    pip3 install tabula-py
    ### 1. Get Linux
FROM alpine:3.7

### 2. Get Java via the package manager
RUN apk update \
&& apk upgrade \
&& apk add --no-cache bash \
&& apk add --no-cache --virtual=build-dependencies unzip \
&& apk add --no-cache curl \
&& apk add --no-cache openjdk8-jre

### 3. Get Python, PIP

RUN apk add --no-cache python3 \
&& python3 -m ensurepip \
&& pip3 install --upgrade pip setuptools \
&& rm -r /usr/lib/python*/ensurepip && \
if [ ! -e /usr/bin/pip ]; then ln -s pip3 /usr/bin/pip ; fi && \
if [[ ! -e /usr/bin/python ]]; then ln -sf /usr/bin/python3 /usr/bin/python; fi && \
rm -r /root/.cache

ENV FLASK_APP main.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT 8080
### Get Flask for the app
RUN pip install --trusted-host pypi.python.org flask

####
#### OPTIONAL : 4. SET JAVA_HOME environment variable, uncomment the line below if you need it

#ENV JAVA_HOME="/usr/lib/jvm/java-1.8-openjdk"

####

EXPOSE 8080
ADD main.py /
CMD ["flask", "run"]

There is 1 answer

evyatar weiss On

It took me a while to figure out what was wrong, but apparently the order of the commands in the Dockerfile is the problem.

FROM python:3
RUN pip uninstall tabula && \
    pip install --upgrade pip && \
    pip install --no-cache-dir Flask pyvirtualdisplay python-environ Datetime && \
    pip install --no-cache-dir glob3 pandas-gbq pandas schedule && \
    pip install --no-cache-dir tabula-py beautifulsoup4 Datetime urllib3 && \
    pip install --no-cache-dir gunicorn Werkzeug && \
    pip install --upgrade pip --user && \
    pip3 uninstall -y tabula-py && \
    pip3 install tabula-py
    ### 1. Get Linux

In the first part I installed all the Python libraries, but right after that I wiped out everything Python-related that had just been installed:

RUN apk add --no-cache python3 \
&& python3 -m ensurepip \
&& pip3 install --upgrade pip setuptools \
&& rm -r /usr/lib/python*/ensurepip && \
if [ ! -e /usr/bin/pip ]; then ln -s pip3 /usr/bin/pip ; fi && \
if [[ ! -e /usr/bin/python ]]; then ln -sf /usr/bin/python3 /usr/bin/python; fi && \
rm -r /root/.cache

So the solution is to delete this part.
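
One thing worth checking after changing the Dockerfile: each FROM line starts a new build stage, and only the last stage ends up in the final image, so packages installed with pip in an earlier stage are not automatically available to the Python that actually runs the app. A small diagnostic script can confirm what the final image really contains. The following is only a sketch; the function name check_runtime and the file name check_runtime.py are made up for illustration, and it assumes tabula-py needs both the tabula module and a java executable on PATH, as described in the question.

```python
import importlib.util
import shutil


def check_runtime(modules=("tabula",), executables=("java",)):
    """Report which required modules and executables are available.

    Hypothetical helper: returns a dict mapping each requirement to True/False.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        report[f"executable:{exe}"] = shutil.which(exe) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_runtime().items():
        print(f"{requirement}: {'found' if ok else 'MISSING'}")
```

Running this with the same interpreter the container uses to start Flask (for example, after copying the script into the image next to main.py) should show whether tabula or Java is the missing piece before you run "gcloud app deploy" again.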