I'm trying to build a Docker image using Bazel's rules_oci like this:
oci_image(
    name = "move_data_image",
    base = "@python_base",
    entrypoint = [
        "/opt/python/dataflow/move_data",
        "--worker_image",
        "$WORKER_IMAGE",
        "--max_num_workers",
        "$MAX_NUM_WORKERS",
        "--runner",
        "DataflowRunner",
    ],
    env = {
        "WORKER_IMAGE": "",
        "MAX_NUM_WORKERS": -1,
    },
    tars = [":move_data_layer"],
)
The idea was to use env vars as arguments to the program so that they can be changed for different executions. I was able to achieve this behavior using a Dockerfile:
ENV WORKER_IMAGE ""
ENV MAX_NUM_WORKERS -1
ENTRYPOINT python src/process_data.py --worker_image=$WORKER_IMAGE \
    --max_num_workers=$MAX_NUM_WORKERS \
    --runner=DataflowRunner
But I'm struggling to do this with Bazel. With the Bazel snippet, the env variable references are passed as literal strings, so I get errors like:
error: argument --max_num_workers: invalid int value: '$MAX_NUM_WORKERS'
Since rules_oci is kind of new, I wasn't able to find a lot of documentation on the correct syntax and usage. I'm wondering if this kind of use case is supported? Thanks in advance!
What you're looking for is the equivalent of the "shell form" of ENTRYPOINT, i.e. running the command through sh -c so that a shell expands the variables (you must have a shell in your base image, though).
What you attempted is equivalent to
ENTRYPOINT ["/opt/python/dataflow/move_data", "--worker_image", "$WORKER_IMAGE", ...]
in a Dockerfile, which won't work either, because the exec form passes its arguments to the program verbatim, with no shell to expand them. You need something that's going to read the environment variables. Instead of using sh -c, you could modify your Python code to read the environment variables directly. You could also write a wrapper (shell script, Python, or something else) that reads the environment variables and builds up the command line.
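As a rough illustration of the "read the environment variables directly" option, here is a minimal sketch using argparse. The flag and variable names come from the question; the function name build_parser and the overall structure are hypothetical, since the real move_data program isn't shown. Environment variables supply the defaults, and explicit command-line flags still win.

```python
import argparse
import os


def build_parser(env=None):
    """Return a parser whose defaults are taken from environment variables.

    `env` is any mapping of variable names to strings; it defaults to
    os.environ. (build_parser is a hypothetical helper, not part of
    rules_oci or the original program.)
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="move_data (illustrative sketch)")
    parser.add_argument(
        "--worker_image",
        # Falls back to an empty string when WORKER_IMAGE is unset.
        default=env.get("WORKER_IMAGE", ""),
    )
    parser.add_argument(
        "--max_num_workers",
        type=int,
        # Treat an unset or empty MAX_NUM_WORKERS as -1.
        default=int(env.get("MAX_NUM_WORKERS") or "-1"),
    )
    parser.add_argument("--runner", default="DataflowRunner")
    return parser
```

With something like this, the program would call build_parser().parse_args() at startup, the entrypoint could shrink to just the binary path, and the values could be overridden per execution (for example with docker run -e MAX_NUM_WORKERS=8), with no shell needed in the base image.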