I would like to run multiple experiments, then report model accuracy per experiment.
I'm training a toy MNIST example with PyTorch (v1.1.0), but the goal is, once I can compare performance on the toy problem, to integrate this into the actual code base.
As I understand the TRAINS Python package, with the "two lines of code" all my hyper-parameters are already logged (command-line argparse in my case).
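(For reference, the "two lines" I mean are just the Task initialization at the top of the training script; the project/task names here are placeholders:)

```python
from trains import Task

# Initializing the Task hooks argparse and starts auto-logging;
# the project/task names below are placeholders for my setup.
task = Task.init(project_name='examples', task_name='toy MNIST')
```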
What do I need to do in order to report a final scalar, and then be able to sort through all the different training experiments (with their hyper-parameters) to find the best one?
What I'd like to get is a graph (or graphs) where the X-axis has hyper-parameter values and the Y-axis has the validation accuracy.
I assume you are referring to https://pypi.org/project/trains/ (https://github.com/allegroai/trains), of which I'm one of the maintainers.
You can manually create a plot with a single point: X-axis for the hyper-parameter value, Y-axis for the accuracy.
Assume your hyper-parameter is "number_layers" with a current value of 10, and the accuracy of the trained model is 0.95.
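A minimal sketch using the Logger's 2D scatter report (the title/series/axis names are arbitrary; pick whatever makes the comparison readable):

```python
from trains import Task

number_layers = 10   # your hyper-parameter value
accuracy = 0.95      # your final validation accuracy

# Report a single-point scatter plot: X = hyper-parameter, Y = accuracy.
Task.current_task().get_logger().report_scatter2d(
    title='performance',
    series='accuracy',
    iteration=0,
    mode='markers',
    scatter=[(number_layers, accuracy)],
    xaxis='number_layers',
    yaxis='accuracy',
)
```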
Then, when you compare the experiments, these single-point plots are overlaid on the same axes, so you get accuracy as a function of number_layers across all the experiments.
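As for reporting a final scalar: you can report the accuracy as a regular scalar as well, which makes it part of the experiment's logged metrics that you can use when comparing experiments. A sketch, with the title/series names again arbitrary:

```python
from trains import Task

accuracy = 0.95  # your final validation accuracy

# Reporting the final accuracy as a scalar adds it to the
# experiment's metrics alongside anything logged during training.
Task.current_task().get_logger().report_scalar(
    title='validation', series='accuracy', value=accuracy, iteration=0)
```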