I am using the StepLR scheduler with the Adam optimizer:
from torch.optim.lr_scheduler import StepLR

optimizer = torch.optim.Adam(model.parameters(), lr=LrMax, weight_decay=decay)  # , betas=(args.beta1, args.beta2)
print(f'Optimizer = {repr(optimizer)}')
scheduler = StepLR(optimizer, step_size=5, gamma=0.2)
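For context, here is a minimal sketch (with a dummy model standing in for mine) of the schedule I expected from StepLR(step_size=5, gamma=0.2): the learning rate should stay at 0.1 for epochs 0 through 4 and only then drop to 0.02.

import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(10, 2)  # stand-in for my real model
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=5, gamma=0.2)

for epoch in range(12):
    # ... training batches and optimizer.step() calls would run here ...
    optimizer.step()
    print(f"epoch={epoch} lr={scheduler.get_last_lr()[0]}")
    scheduler.step()  # stepped once per epoch
# Expected output: lr=0.1 for epochs 0-4, 0.02 for epochs 5-9, 0.004 for epochs 10-11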
The initial learning rate (LrMax) is set to 0.1. At the end of the first epoch (epoch 0) everything looks stable:
Train: Epoch 0 Batch=601 totalBatches=601 lr=0.1: Loss=2.5451838970184326 accuracy=0.1875
Train: Epoch 0 Batch=651 totalBatches=651 lr=0.1: Loss=3.527266025543213 accuracy=0.1875
Train: Epoch 0 Batch=656 totalBatches=656 lr=0.1: Loss=2.9547425508499146 accuracy=0.1875
But then the clock strikes midnight: we move on to epoch 1 and the learning rate scheduler goes bonkers:
Train: Epoch 1 Batch=45 totalBatches=701 lr=2.722258935367529e-93: Loss=2.878746271133423 accuracy=0.25
- Why did the learning rate drop by a factor of roughly 10^92?
- Why did the learning rate change at epoch 1 instead of at epoch 5, given step_size=5?