Unable to load Llama 2 (7b) model in Docker container


I am trying to containerize a simple Flask application that performs inference with the llama-2-7b.Q5_K_M.gguf model. The application runs fine on its own and loads the model successfully, but when I run it inside a Docker container it throws the error below.

  File "/app/app.py", line 6, in <module>
    llm_llama2_7b = AutoModelForCausalLM.from_pretrained(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ctransformers/hub.py", line 175, in from_pretrained
    llm = LLM(
          ^^^^
  File "/usr/local/lib/python3.11/site-packages/ctransformers/llm.py", line 246, in __init__
    self._lib = load_library(lib, gpu=config.gpu_layers > 0)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ctransformers/llm.py", line 126, in load_library
    lib = CDLL(path)
          ^^^^^^^^^^
  File "/usr/local/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /usr/local/lib/python3.11/site-packages/ctransformers/lib/basic/libctransformers.so: cannot open shared object file: No such file or directory
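Note that `dlopen` reports "cannot open shared object file: No such file or directory" both when the `.so` itself is absent from the image and when the file exists but one of its own shared-library dependencies cannot be resolved. A small diagnostic sketch (using the path from the traceback above) can separate the two cases inside the container:

```python
import os
from ctypes import CDLL

# Path copied from the traceback above
LIB = ("/usr/local/lib/python3.11/site-packages/"
       "ctransformers/lib/basic/libctransformers.so")

def diagnose(path: str) -> str:
    """Distinguish 'file absent' from 'file present but dlopen fails'."""
    if not os.path.exists(path):
        return "missing file"   # the .so was never installed in the image
    try:
        CDLL(path)
    except OSError as e:
        # The file exists, so dlopen is failing for another reason, most
        # often an unresolved dependency of the library itself.
        return f"load failed: {e}"
    return "loads fine"

if __name__ == "__main__":
    print(diagnose(LIB))
```

If this reports "load failed", running `ldd` on the `.so` inside the container will list any dependency marked `not found`.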

Code Directory

(screenshot of the project directory; image not available)

Dockerfile

FROM python:3.11.6-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . /app

EXPOSE 5000
ENV FLASK_APP=app.py
CMD ["flask", "run", "--host=0.0.0.0"]
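If the load failure turns out to be an unresolved system dependency rather than a missing file, one possible fix is to install the needed runtime library in the image. `libgomp1` is a common culprit for prebuilt llama.cpp-based binaries on slim Debian images, but this is an assumption — verify with `ldd` on the `.so` first:

```dockerfile
FROM python:3.11.6-slim

WORKDIR /app

# Assumption: the prebuilt libctransformers.so needs libgomp1, which the
# slim base image does not ship; confirm with `ldd` before relying on this.
RUN apt-get update && apt-get install -y --no-install-recommends libgomp1 \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . /app

EXPOSE 5000
ENV FLASK_APP=app.py
CMD ["flask", "run", "--host=0.0.0.0"]
```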

app.py

from flask import Flask, request
from ctransformers import AutoModelForCausalLM

app = Flask(__name__)

llm_llama2_7b = AutoModelForCausalLM.from_pretrained("models/llama-2-7b.Q5_K_M.gguf", model_type="llama")
print("Model loaded")

@app.route('/')
def main():
    return 'Hello World'

if __name__ == '__main__':
    app.run(host='0.0.0.0')

requirements.txt

Flask
ctransformers