I am trying to deploy an Ollama Docker image with preinstalled models to Google Cloud Run:
FROM ollama/ollama:latest
RUN /bin/sh -c "/bin/ollama serve & sleep 1 && ollama pull phi"
ENTRYPOINT ["/bin/ollama"]
CMD ["serve"]
It works fine when I pull and run it on localhost, or on Compute Engine, but when it is deployed to Cloud Run, a GET request to the hosted endpoint returns "Ollama is running". However, when I call the /api/generate endpoint, it says the model is not found.
I want it to generate some text with the Ollama model that I predownloaded into the image, but it says there is no such model.
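For reference, the failing call looks roughly like this (the service URL here is a hypothetical placeholder, not my real endpoint):

```shell
# Hypothetical Cloud Run URL -- replace with your own service endpoint.
# A plain GET to the root responds with "Ollama is running":
curl https://ollama-service-xyz-uc.a.run.app/
# ...but generation reports a missing model:
curl https://ollama-service-xyz-uc.a.run.app/api/generate \
  -d '{"model": "phi", "prompt": "Hello"}'
```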
Any help from the community will be highly appreciated :)
I am not sure that downloading the model (which can be huge) directly into the image is good practice, since it will significantly increase the size of your image.
What you can do instead is launch the server at the start of the container and then pull/run the model you need.
First, remove the "ollama" entrypoint. Then start the server in the background with "&" (I wait 15 seconds to make sure it has launched before running the model), and chain "ollama run phi" after the sleep with "&&". Don't forget to include "wait" at the end; otherwise the container will stop immediately after setup finishes.
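Putting that together, a minimal sketch of the Dockerfile could look like this (the 15-second sleep and the model name "phi" are taken from the question; tune both for your setup):

```dockerfile
FROM ollama/ollama:latest
# Override the image's default "ollama" entrypoint with a shell so we can
# chain commands: start the server in the background (&), give it time to
# come up, load the model (ollama run pulls it first if it is missing),
# and finally wait so the container does not exit once setup is done.
ENTRYPOINT ["/bin/sh", "-c", "/bin/ollama serve & sleep 15 && ollama run phi & wait"]
```

This keeps the image small because the model is downloaded at container startup rather than baked in at build time, at the cost of a slower cold start on Cloud Run.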