Too much fluctuation in F1 Score curve during meta training with MAML

I am training VGG11 on a custom image dataset for 3-way 5-shot image classification using MAML from learn2learn. I am encapsulating the whole VGG11 model with MAML, i.e., not just the classification head. My hyperparameters are as follows:

  • Meta LR: 0.001
  • Fast LR: 0.5
  • Adaptation steps: 1
  • First order: False
  • Meta Batch Size: 5
  • Optimizer: AdamW

During training, I noticed that after the first outer-loop optimization step, i.e., AdamW.step(), the loss skyrockets to very large values, on the order of tens of thousands. Is this normal? Also, I am using the micro F1 score as my accuracy metric. Its curve for meta-training/validation is as follows:

[Figure: Meta Train/Val F1 Score curve]

In my opinion it fluctuates too much. Is this normal? What could be the reason for this? Thanks.
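As a side note on the metric itself: in a single-label multi-class setting such as 3-way classification, micro-averaged F1 is numerically the same as plain accuracy, because every misclassified sample counts as exactly one false positive and one false negative. The sketch below illustrates this in plain Python. The labels are made-up illustration data, and the query-set size of 15 is an assumption, since the post does not state it.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 3-way query set with 5 samples per class (assumed sizes).
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 0, 1, 0, 2, 1, 1, 1, 0, 1, 2, 2, 0, 2, 2]

f1 = micro_f1(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f1, acc)
```

With a query set this small, a single flipped prediction moves the score by 1/15, so some step-to-step jitter in the curve is expected even when training is healthy.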

1 answer

The Exile (best answer)

I figured it out. I was using VGG11 with vanilla BatchNorm layers from PyTorch, which do not work properly in a meta-training setup. I removed the BatchNorm layers and now it works as expected.
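For anyone who wants to try the same fix, here is a minimal sketch of one way to strip BatchNorm layers from an existing PyTorch model by swapping them for nn.Identity. The helper name strip_batchnorm and the toy network are hypothetical and only for illustration; this is not necessarily how the layers were removed in my actual code.

```python
import torch
from torch import nn

# Layer types treated as BatchNorm for the purpose of this sketch.
BATCHNORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def strip_batchnorm(module: nn.Module) -> nn.Module:
    """Recursively replace every BatchNorm layer with nn.Identity, in place."""
    for name, child in module.named_children():
        if isinstance(child, BATCHNORM_TYPES):
            setattr(module, name, nn.Identity())
        else:
            strip_batchnorm(child)
    return module


# Toy stand-in for a VGG-style block (hypothetical, much smaller than VGG11).
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 3),  # 3 outputs for 3-way classification
)

strip_batchnorm(toy)

# Forward pass on a dummy batch of five 32x32 RGB images.
out = toy(torch.randn(5, 3, 32, 32))
print(out.shape)
```

If the model comes from torchvision, an alternative is to start from the variant without BatchNorm (torchvision.models.vgg11 rather than vgg11_bn) instead of stripping the layers afterwards.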