Trying to deploy Ollama to Google Cloud Run


I am trying to deploy an Ollama Docker image with preinstalled models to Google Cloud Run:

FROM ollama/ollama:latest

RUN /bin/sh -c "/bin/ollama serve & sleep 1 && ollama pull phi"

ENTRYPOINT ["/bin/ollama"]

CMD ["serve"]

It works fine when I pull and run the image on localhost or on cloud engine. When it is deployed to Cloud Run, a GET request to the hosted endpoint returns "Ollama is running", but a request to the /api/generate endpoint says the model is not defined.

I want it to run the Ollama model that I pre-downloaded into the image, but it says there is no such model. I want it to generate some text.
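The requests described above can be sketched with curl as follows. This is only an illustration: SERVICE_URL is a placeholder for the deployed service's address (it defaults here to Ollama's local port 11434), and the helper function names are made up for this example. Listing the models through /api/tags is one way to see whether the running server can actually find "phi".

```shell
#!/bin/sh
# Placeholder address; replace with the Cloud Run service URL (assumption).
SERVICE_URL="${SERVICE_URL:-http://localhost:11434}"

# Build the JSON body for /api/generate.
# Note: no JSON escaping is done, so keep the prompt free of quotes.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# GET / : the health check that answers "Ollama is running".
check_root() {
    curl -s "$SERVICE_URL/"
}

# GET /api/tags : lists the models the running server knows about.
list_models() {
    curl -s "$SERVICE_URL/api/tags"
}

# POST /api/generate : ask a model for a completion.
generate() {
    curl -s "$SERVICE_URL/api/generate" -d "$(generate_payload "$1" "$2")"
}
```

For example, after setting SERVICE_URL, calling list_models and then generate phi "Why is the sky blue?" shows both the model list and the generate response from the same server.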

Any help from the community will be highly appreciated :)

There is 1 answer

Answer by KYPcode

I am not sure that downloading the model (which can be huge) directly into the image is good practice, since it will greatly increase the size of your image.

What you can do instead is start the server when the container launches and then pull/run the model you need:

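FROM ollama/ollama:latest

# Clear the image's default "ollama" entrypoint so that CMD can run a shell.
ENTRYPOINT []
# Start the server in the background, wait 15 s, then run phi; "wait" keeps the container alive.
CMD ["/bin/sh", "-c", "/bin/ollama serve & sleep 15 && /bin/ollama run phi & wait"]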

First, remove the "ollama" entrypoint. Then you can start the server in the background and give it time to come up (I sleep for 15 seconds so that the server is running before the model is requested). The && runs "ollama run phi" only after the sleep finishes, and the trailing & puts that step in the background alongside the server. Don't forget the wait at the end; otherwise the shell exits and the container stops immediately.
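A fixed 15-second sleep can be too short or needlessly long. As a variation (not part of the original answer), the startup could poll until the server responds. The sketch below assumes a hypothetical script named start.sh copied into the image, uses "ollama list" as the readiness probe, and swaps "ollama run" for "ollama pull" so that the model is only downloaded. The OLLAMA_BIN and MODEL variables and the "start" argument are conventions invented for this example.

```shell
#!/bin/sh
# start.sh (hypothetical): start the Ollama server, wait until it answers, then pull a model.
OLLAMA_BIN="${OLLAMA_BIN:-/bin/ollama}"
MODEL="${MODEL:-phi}"

# Retry "ollama list" once per second until it succeeds or attempts run out.
wait_for_server() {
    attempts="${1:-60}"
    while [ "$attempts" -gt 0 ]; do
        if "$OLLAMA_BIN" list >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

start() {
    "$OLLAMA_BIN" serve &
    server_pid=$!
    if wait_for_server 60; then
        "$OLLAMA_BIN" pull "$MODEL"
    else
        echo "ollama server did not become ready" >&2
    fi
    # Block on the server process so the container stays up.
    wait "$server_pid"
}

# Run only when invoked as "start.sh start" (a convention chosen for this sketch).
if [ "${1:-}" = "start" ]; then
    start
fi
```

In the Dockerfile this would replace the CMD line with something like COPY start.sh /start.sh followed by ENTRYPOINT ["/bin/sh", "/start.sh", "start"]. The model is still downloaded on every cold start, so the container takes longer to become useful than one with the model baked into the image.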