I am using Hydra with Databricks workflows. My code needs to accept runtime command-line parameters (not Hydra config-based parameters) from the workflow.
In my code I want to use argparse to retrieve these runtime parameters. They need not be handled alongside the Hydra decorator, but it is vital that they are available in my script.
My script (set_task_values.py) looks something like this:
import hydra
from omegaconf import DictConfig, OmegaConf
from hydra.core.hydra_config import HydraConfig
import argparse

@hydra.main(version_base=None, config_path="conf_tasks", config_name="config_dummy")  # Hydra decorator
def my_app(cfg: DictConfig) -> None:
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--task_values", type=str, required=True)
    args, unknown = parser.parse_known_args()
    print("Task Values", "->", args.task_values)
    dbutils.jobs.taskValues.set(key="task_values",
                                value=args.task_values)

if __name__ == "__main__":
    my_app()
My config (config_dummy.yaml) looks something like this, although it is not relevant here, since there is no issue reading and retrieving the Hydra config:
operand: add
batch_size: 64
learning_rate: 0.01
creation_did: 47
custom:
  email: [email protected]
  exp_title: Exp_Tasks
hydra:
  run:
    dir: /dbfs/some_path/${custom.email}/${now:%Y-%m-%d}/${custom.exp_title}
This is what my Databricks workflow looks like:
Quick reference: a Databricks workflow task has a "parameters" option, where you can enter command-line parameters to be passed to the task/script.
My command-line parameter is this:
["--task_values","{'T1':0,'T2':1}"]
What I am trying to achieve is to pass a set of external command-line parameters, as shown in the "parameters" section above. One of them, task_values, is required by my script.
But when I run this job/workflow, which internally runs my script (set_task_values.py) and expects the runtime command-line parameter (task_values) to be retrieved, I get the following error:
usage: set_task_values.py [--help] [--hydra-help] [--version]
[--cfg {job,hydra,all}] [--resolve]
[--package PACKAGE] [--run] [--multirun]
[--shell-completion] [--config-path CONFIG_PATH]
[--config-name CONFIG_NAME]
[--config-dir CONFIG_DIR]
[--experimental-rerun EXPERIMENTAL_RERUN]
[--info [{all,config,defaults,defaults-tree,plugins,searchpath}]]
[overrides [overrides ...]]
set_task_values.py: error: unrecognized arguments: --task_values
/databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3445: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
Can someone help with how I can use runtime command line parameters with hydra?
What you are trying to do, passing your own command-line arguments alongside Hydra, is not supported.
Refer to this Stack Overflow solution for more information.
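For context, the error appears because Hydra parses sys.argv with its own argument parser when the decorated function is called, so the body of my_app never gets the chance to run argparse. A workaround that is sometimes used, sketched here as an assumption and not as an officially supported Hydra feature, is to pull the extra flag out of the argument list first and hand only the remainder to Hydra. The helper name split_runtime_args is hypothetical:

```python
import argparse


def split_runtime_args(argv):
    """Separate --task_values from the rest of an argv-style list.

    Hypothetical helper: returns (namespace, remaining), where `remaining`
    keeps the script name plus any arguments left over for Hydra.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--task_values", type=str, required=True)
    # parse_known_args returns unrecognised arguments instead of failing on them
    args, remaining = parser.parse_known_args(argv[1:])
    return args, [argv[0]] + remaining


# Demonstration on a fixed list instead of the real sys.argv
example_argv = ["set_task_values.py", "--task_values", "{'T1':0,'T2':1}", "batch_size=32"]
runtime_args, hydra_argv = split_runtime_args(example_argv)
print(runtime_args.task_values)  # {'T1':0,'T2':1}
print(hydra_argv)                # ['set_task_values.py', 'batch_size=32']

# In the real script (assumption), this would run before calling my_app():
#     runtime_args, sys.argv = split_runtime_args(sys.argv)
#     my_app()
```

With this approach the parsed value has to reach my_app some other way, for example through a module-level variable, because Hydra only passes cfg to the decorated function.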
The supported route is Hydra's own override syntax.
Add a task_values key to the config_dummy.yaml file, and pass its value as an override in the task parameters, like this:
["task_values={T1: 4, T2: 2}"]
Refer to this documentation for more about overrides.
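If the task values start out as a Python dict, the override string for the task parameters can be generated instead of typed by hand. This is a minimal sketch, not part of Hydra: to_hydra_override is a hypothetical helper, and it assumes a flat dict with simple keys and numeric values that need no quoting.

```python
import json


def to_hydra_override(key, mapping):
    """Format a flat dict as a Hydra-style override string (hypothetical helper).

    Assumes keys and values need no quoting or escaping.
    """
    items = ", ".join(f"{k}: {v}" for k, v in mapping.items())
    return f"{key}={{{items}}}"


override = to_hydra_override("task_values", {"T1": 4, "T2": 2})
print(override)                # task_values={T1: 4, T2: 2}
# JSON list in the form the Databricks "parameters" field expects
print(json.dumps([override]))  # ["task_values={T1: 4, T2: 2}"]
```

Note that this form only overrides a key that already exists in config_dummy.yaml. If task_values is not defined there, Hydra expects the key to be prefixed with + (that is, +task_values=...) so that it is added to the config.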