In the CNTK implementation of the Adam optimizer, how do the parameters alpha, beta1, beta2 and epsilon relate to learning rate and momentum?

I am using the adam_sgd learner to train a neural network, and I am having trouble matching the arguments of the function to the parameters reported in the Adam paper. More specifically, how do the paper's parameters alpha, beta1, beta2 and epsilon relate to the learning rate and momentum arguments in the CNTK implementation of Adam?
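For context, here is a minimal sketch of the call I am trying to understand, assuming the CNTK 2.x Python API (where adam_sgd was renamed to cntk.learners.adam). The mapping in the comments is my guess at how the arguments line up with the paper, which is exactly what I would like confirmed:

```python
import cntk as C

# Toy model and loss, purely for illustration.
x = C.input_variable(2)
y = C.input_variable(1)
model = C.layers.Dense(1)(x)
loss = C.squared_error(model, y)

# Presumed mapping to the Adam paper (my assumption, not confirmed):
#   lr                -> alpha   (the step size)
#   momentum          -> beta1   (decay rate of the first-moment estimate)
#   variance_momentum -> beta2   (decay rate of the second-moment estimate)
#   epsilon           -> epsilon (numerical-stability constant)
learner = C.learners.adam(
    model.parameters,
    lr=C.learning_parameter_schedule(0.001),       # alpha?
    momentum=C.momentum_schedule(0.9),             # beta1?
    variance_momentum=C.momentum_schedule(0.999),  # beta2?
    epsilon=1e-8,
    unit_gain=False,  # True enables CNTK's unit-gain variant, which rescales the update
)
trainer = C.Trainer(model, (loss, None), [learner])
```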