Dataset 1 and dataset 2 have the same column names but different descriptions. When I am transforming dataset 1, I want to indicate that I am working on dataset 1 so that its specific descriptions take preference. When I am transforming another dataset, I want that dataset's descriptions to take preference. Is there a way to populate column descriptions that are dataset-specific?
For example, in the arguments of my_compute_function, is there a way to pass the name of the dataset that should be given priority?
Column1, Column Description for dataset 1, {Dataset 1 name}
Column1, Column Description for dataset 2, {Dataset 2 name}
...
from transforms.api import transform, Input, Output

@transform(
    my_output=Output("/my/output"),
    my_input=Input("/my/input"),
)
def my_compute_function(my_input, my_output):
    my_output.write_dataframe(
        my_input.dataframe(),
        column_descriptions={
            "col_1": "col 1 description"
        },
        ???
    )
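One way to do this is to provide an 'override dictionary' for all your datasets, where dataset-specific descriptions take precedence over general ones,
i.e. you have a general dictionary of descriptions plus a dataset-specific dictionary whose entries win whenever both define the same column.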
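A minimal sketch of the override idea, in plain Python so it can be run outside Foundry. The name LOCAL_DESCRIPTIONS, the helper resolve_descriptions, and the description strings are assumptions made for illustration; only GENERAL_DESCRIPTIONS and the column_descriptions argument come from the discussion here.

```python
# Shared descriptions; these could live at the root of the module.
GENERAL_DESCRIPTIONS = {
    "col_1": "col 1 general description",
    "col_2": "col 2 general description",
}

# Hypothetical overrides for one specific dataset.
LOCAL_DESCRIPTIONS = {
    "col_1": "col 1 description for dataset 1",
}


def resolve_descriptions(general, local):
    """Return a new dict in which local entries take precedence."""
    merged = dict(general)  # copy, so the shared dict is never mutated
    merged.update(local)    # local keys overwrite general ones
    return merged


final_descriptions = resolve_descriptions(GENERAL_DESCRIPTIONS, LOCAL_DESCRIPTIONS)

# Inside the transform, the merged dict would be passed as in the question:
# my_output.write_dataframe(
#     my_input.dataframe(),
#     column_descriptions=final_descriptions,
# )
```

The merge builds a copy rather than updating GENERAL_DESCRIPTIONS in place, so overrides made for one dataset cannot leak into another transform that imports the same shared dictionary.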
This would then allow you to put GENERAL_DESCRIPTIONS at the root of your module and override it at the top of each transformation.py file with your 'local' descriptions. You could even put the 'local' descriptions above a group of transformations so you don't have to inspect each and every file to specify overrides.
The most granular way to update the description dictionary is simply to change the relevant entries directly inside the transform that needs them.