I'm using Azure Machine Learning (Azure ML) to manage my machine learning workflows, and I want to set up dataset scheduling based on trigger time. The path format of my dataset differs from the trigger time format. For example, my dataset is stored at "path_on_datastore/2023/01/01/some_data.tsv", while the trigger time is provided in a different format.
I have discovered that the scheduling function supports using "${{creation_context.trigger_time}}" as a PipelineParameter (link: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipeline-job?view=azureml-api-2&tabs=cliv2#expressions-supported-in-schedule), but the format it provides doesn't match the format of my dataset path. I tried to use components to convert it, but components only support outputting a dataset. Is there a way to customize the trigger time format or adapt it to match my dataset format?
You can use the `PythonScriptStep` class in Azure Machine Learning to execute a Python script that builds the formatted data path from the trigger time. Example Python script file (`script.py`):
With the script you can create a pipeline:
Then you can schedule the pipeline:
To disable or update the schedule:
The above example shows how you can use the `PythonScriptStep` class with the current time from `datetime` as the trigger time. For more information, please refer to this. Note: Make sure to adjust the Python script and the datastore paths as necessary.
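As a rough sketch of the path-formatting logic a script such as `script.py` might contain, the snippet below turns a trigger time string into the dated folder layout from the question. It assumes the trigger time arrives as an ISO 8601 style timestamp whose first ten characters are `YYYY-MM-DD`; the function name `format_data_path` and its parameters are hypothetical and not part of any Azure ML API.

```python
from datetime import datetime


def format_data_path(trigger_time: str,
                     base: str = "path_on_datastore",
                     filename: str = "some_data.tsv") -> str:
    """Build a dated datastore path from a trigger time string.

    Assumption: trigger_time starts with a YYYY-MM-DD date, as in an
    ISO 8601 timestamp such as "2023-01-01T00:00:00Z". Only the date
    portion is parsed, so the time and timezone suffix are ignored.
    """
    # Parse only the leading date so the exact time format does not matter.
    date = datetime.strptime(trigger_time[:10], "%Y-%m-%d")
    # Render the date as year/month/day folders with zero padding.
    return f"{base}/{date:%Y/%m/%d}/{filename}"


# Example usage with a hypothetical trigger time value.
print(format_data_path("2023-01-01T00:00:00Z"))
```

Inside a pipeline step, the trigger time string could be passed to the script as an argument and the resulting path used to locate the data on the datastore. Parsing only the date portion keeps the sketch independent of how the time and timezone are rendered.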