I'm running a pipeline job with a sweep step in AzureML.
I'm using the CLI v2 YAML syntax to define the components: a pipeline.yaml that references train.yaml as the sweep's trial, and train.yaml in turn runs train.py.
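For context, I submit the job with the standard CLI v2 command (the resource group and workspace names below are placeholders for my actual ones):

az ml job create --file pipeline.yaml --resource-group my-rg --workspace-name my-ws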
The job fails after launch because the data path argument that train.py receives, which should resolve from the azureml:test_data:1 data asset, is None.
To test what is going on, I logged the other arguments as well, and I discovered that even though I specify the parameter values you can see in pipeline.yaml below (seq_length: 100, epochs: 1, plus the swept batch_size and learning_rate), the logged values are exactly the defaults from the add_argument definitions in train.py. This is what one of the sweep's child runs logs:

path: None
batch_size: 5
seq_length: 500
epochs: 3
So let's ignore the question of the path being None for now; what I want to understand is why the values from pipeline.yaml aren't being passed through at all. I have triple-checked my arguments, and the names match across the three files.
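(In case it helps, this is roughly the check I ran to confirm the names line up; the ../src path matches the code: entry in train.yaml:)

grep -nE 'data_path|batch_size|learning_rate|seq_length|epochs' pipeline.yaml train.yaml ../src/train.py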
For reference:
pipeline.yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: test_sweep
description: Tune hyperparameters

settings:
  default_compute: azureml:test-compute

jobs:
  sweep_step:
    type: sweep
    inputs:
      data_path:
        type: uri_file
        path: azureml:test_data:1
      seq_length: 100
      epochs: 1
    outputs:
      model_output:
    sampling_algorithm: random
    search_space:
      batch_size:
        type: choice
        values: [1, 5, 10, 15]
      learning_rate:
        type: loguniform
        min_value: -6.90775527898
        max_value: -2.30258509299
    trial: ./train.yaml
    objective:
      goal: maximize
      primary_metric: bleu_score
    limits:
      max_total_trials: 5
      max_concurrent_trials: 3
      timeout: 3600       # 1 hour
      trial_timeout: 720  # 12 mins
train.yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command
name: train_model
display_name: train_model
version: 1

inputs:
  data_path:
    type: uri_file
  batch_size:
    type: integer
  learning_rate:
    type: number
  seq_length:
    type: integer
  epochs:
    type: integer
outputs:
  model_output:
    type: mlflow_model

code: ../src
environment: azureml:test_env:2

command: >-
  python train.py
  --data_path ${{inputs.data_path}}
  --output_path ${{outputs.model_output}}
  --batch_size ${{inputs.batch_size}}
  --learning_rate ${{inputs.learning_rate}}
  --seq_length ${{inputs.seq_length}}
  --epochs ${{inputs.epochs}}
train.py (just the relevant part)
import argparse
import logging


def main(args):
    data_path = args.data_path
    output_path = args.output_path
    batch_size = args.batch_size
    seq_length = args.seq_length
    epochs = args.epochs
    learning_rate = args.learning_rate

    handler = logging.StreamHandler()
    logger = logging.getLogger(__name__)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    logger.info(f'path: {data_path}')
    logger.info(f'batch_size: {batch_size}')
    logger.info(f'seq_length: {seq_length}')
    logger.info(f'epochs: {epochs}')


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_path", type=str)
    parser.add_argument("--output_path")
    parser.add_argument("--batch_size", type=int, default=5)
    parser.add_argument("--learning_rate", type=float, default=1e-5)
    parser.add_argument("--seq_length", type=int, default=500)
    parser.add_argument("--epochs", type=int, default=3)
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    main(args)
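To rule out argparse itself, my next step is to log the raw command line each child run actually receives, before any defaults get applied. A minimal debugging sketch of what I'd add to parse_args:

import sys

def parse_args():
    # Debugging aid: show the literal argv this child run was started with,
    # before argparse fills in any defaults.
    print(f"raw argv: {sys.argv}", flush=True)
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_path", type=str)
    # ... same add_argument calls as above ...
    return parser.parse_args()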