As far as I understand, we need to supply the Trainer component with training examples directly (via the "examples" parameter) from the output of an ExampleGen or a Transform component, but at the same time we also need to supply it with a "module_file" containing a run_fn that takes care of training and saving the model. My question is: since the run_fn only receives what it needs in the form of an FnArgs parameter, where exactly does it get the training (or evaluation) data?
To clarify, in the official tutorial (in the section titled "Write model training code"), the run_fn relies on an _input_fn that converts the data provided in fn_args.train_files into a Dataset object, but where exactly is fn_args.train_files provided? Does the Trainer component infer this from the examples parameter under the hood and supply it (along with the other things needed) to the run_fn in the form of the fn_args parameter? If we're already supplying the examples directly to the Trainer component, why is there no mechanism for the run_fn to access them directly? It's all very confusing! :(
Thanks in advance for your help!
The Trainer component takes as input the examples produced by an upstream component such as ExampleGen or Transform, together with the training code defined in the module_file, and it outputs a trained model.
The run_fn receives its inputs via the fn_args parameter, which carries the information needed for the training process.
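Among these fields, fn_args.train_files is populated by the Trainer component itself: it is derived from the examples input and contains the file paths to the training data (fn_args.eval_files does the same for the evaluation data).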
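To make that hand-off concrete, here is a rough stand-in for what happens before run_fn is called. This is not the real TFX code: FnArgsSketch and build_fn_args are made-up names for illustration, the example URI is invented, and the Split-train / Split-eval directory layout is an assumption about how the examples artifact is stored (it may differ between TFX versions).

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class FnArgsSketch:
    """Hypothetical stand-in for the FnArgs object handed to run_fn."""
    train_files: List[str] = field(default_factory=list)
    eval_files: List[str] = field(default_factory=list)


def build_fn_args(examples_uri: str) -> FnArgsSketch:
    """Turn the location of the `examples` input into per-split file patterns."""
    # Assumed layout: <examples_uri>/Split-train/* and <examples_uri>/Split-eval/*
    return FnArgsSketch(
        train_files=[os.path.join(examples_uri, "Split-train", "*")],
        eval_files=[os.path.join(examples_uri, "Split-eval", "*")],
    )


# Invented URI, standing in for wherever ExampleGen wrote its output.
examples_uri = "/tmp/pipeline/CsvExampleGen/examples/1"
fn_args = build_fn_args(examples_uri)
print(fn_args.train_files)
print(fn_args.eval_files)
```

So the run_fn never sees the examples channel itself, only the file patterns that were worked out from it.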
Inside the run_fn, the _input_fn is responsible for creating the input data pipelines for training and evaluation. It reads the files listed in fn_args.train_files (or fn_args.eval_files) and turns them into a Dataset that the model can consume.