Knative - Kubernetes YAML to mount data from Google Cloud Storage


I am a newbie to Kubernetes and Cloud Run deployment using YAML files, so pardon me if this question is very basic.

Problem: I have files that are stored in Cloud Storage. I want to download these files into a local mount before the container spins up my Docker entrypoint.

It is my understanding that Knative does not support volumes or persistentVolumeClaims.

Please correct me if this understanding is wrong.

Let me explain it better (the original diagram showed the three stages listed below).

Inside the Kubernetes pod, I have divided the container startup into three stages:

  1. Prehook to download files from GCS (Google Cloud Storage) -> This copies files from Google Cloud Storage to the local mount. Possibly done with some kind of init container using the cloud-sdk image and then gsutil to copy the files down (see the sketch after this list).
  2. Local mount filesystem -> The prehook writes into this mount. The container running the "container image" also has access to this mount.
  3. Container image -> This is my main container image running in the container.
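For reference, this is roughly what I mean, sketched as a plain-Kubernetes Pod manifest (all names here are placeholders, and the init container would still need GCS credentials via the node or a service account):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  volumes:
  # Shared scratch volume: the prehook writes here, the main container reads.
  - name: data
    emptyDir: {}
  initContainers:
  # Prehook: runs to completion before the main container starts.
  - name: gcs-prehook
    image: google/cloud-sdk:alpine
    command: ["gsutil", "cp", "gs://my-bucket/name.csv", "/mnt/data/"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  containers:
  - name: app
    image: gcr.io/my-project/my-image
    volumeMounts:
    - name: data
      mountPath: /mnt/data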

I am looking for a Knative Serving solution that will work on Cloud Run. How do I solve this?

Additionally, is it possible to have a YAML file that creates a Cloud Run service without Knative Serving?

1 Answer

Answered by guillaume blaquiere (accepted):

The Knative contract, as you said, doesn't allow mounting or claiming a volume. So you can't achieve this (for now, on Cloud Run managed).

On the other hand, a Pod allows this, but Knative uses a special version of "Pod": no persistent volumes, and you can't define a list of containers; it's a pod with only one container (plus the mesh sidecar, most of the time Istio, injected when you deploy).

For your additional question: Cloud Run implements the Knative API, and thus you need to present a Knative Serving YAML file to configure your service.
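For example, a minimal Knative Serving manifest that Cloud Run accepts (the service name, project, and image below are placeholders), deployable with `gcloud run services replace service.yaml`:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      # Exactly one container, per the Knative contract described above.
      - image: gcr.io/my-project/my-image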


If you want to write files, you can do it in the /tmp in-memory partition. So, when your container starts, download the files and store them there. However, if you update the files and need to persist the update, you have to push them manually to Cloud Storage.

In addition, other running instances that have already downloaded the files and stored them in their /tmp directory won't see the change in Cloud Storage; only new instances will.

UPDATE 1:

If you want to download the files "before" the container starts, you have 2 solutions:

  1. "Before" is not possible, you can do this at startup:
  • The container start
  • Download the files, initiate what you need
  • Serve traffic with your webserver.

The previous solution has two issues:

  • The service cold start is impacted by the download before serving.
  • The maximum file size is limited by the memory size of the instance (the /tmp directory is an in-memory file system; if you have a configuration of 2GB, the max size is 2GB minus the memory footprint of your app).
  2. The second solution is to build your container with the files already present in the container image:
  • No cold start impact.
  • No memory limitation.
  • Reduced agility: you need to build and deploy a new revision on every file change.

UPDATE 2:

For solution 1, it's not a Knative solution, it's in your code! I don't know your language and framework, but at startup you need to use the Google Cloud Storage client library to download, from your code, the files that you need.

Show me your server startup and I could try to provide you an example!
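In the meantime, since the Dockerfile below builds a Go server, here is a minimal sketch with the cloud.google.com/go/storage client (the bucket and object names reuse the placeholders from the Dockerfile example; adapt them to your setup):

package main

import (
	"context"
	"io"
	"log"
	"net/http"
	"os"

	"cloud.google.com/go/storage"
)

// downloadFromGCS copies gs://<bucket>/<object> to a local path.
func downloadFromGCS(ctx context.Context, bucket, object, dest string) error {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	r, err := client.Bucket(bucket).Object(object).NewReader(ctx)
	if err != nil {
		return err
	}
	defer r.Close()

	f, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = io.Copy(f, r)
	return err
}

func main() {
	// Download before serving: this work happens during the cold start,
	// and /tmp is the writable in-memory partition on Cloud Run.
	if err := downloadFromGCS(context.Background(), "my-bucket", "name.csv", "/tmp/name.csv"); err != nil {
		log.Fatalf("downloading gs://my-bucket/name.csv: %v", err)
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, "/tmp/name.csv")
	})

	port := os.Getenv("PORT") // Cloud Run injects PORT
	if port == "" {
		port = "8080"
	}
	log.Fatal(http.ListenAndServe(":"+port, nil))
}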

For solution 2, the files aren't in your git repo but still in Cloud Storage. Your Dockerfile can look like this:

FROM google/cloud-sdk:alpine as gcloud
WORKDIR /app

# If you aren't building your image on Cloud Build, you need to be authenticated first:
#ARG KEY_FILE_CONTENT
#RUN echo $KEY_FILE_CONTENT | gcloud auth activate-service-account --key-file=-

# Get the file(s)
RUN gsutil cp gs://my-bucket/name.csv .

FROM golang:1.15-buster as builder

WORKDIR /app
COPY go.* ./
....
RUN go build -v -o server

FROM debian:buster-slim

# Copy the binary to the production image from the builder stage.
COPY --from=builder /app/server /app/server
COPY --from=gcloud /app/name.csv /app/name.csv

# Run the web service on container startup.
CMD ["/app/server"]

You can also imagine downloading the files before the docker build command and simply performing a COPY in the Dockerfile. I don't know your container creation pipeline, but these are ideas that you can reuse!
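A quick sketch of that variant, with the same placeholder bucket and file names as above. Run `gsutil cp gs://my-bucket/name.csv .` before `docker build`; the gcloud stage then disappears and the final stage copies straight from the build context:

# Final stage only; the builder stage stays as in the Dockerfile above.
FROM debian:buster-slim

COPY --from=builder /app/server /app/server
# name.csv was downloaded into the build context before `docker build`.
COPY name.csv /app/name.csv

CMD ["/app/server"]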