I am classifying video sequences and need two things:

1. Because of limited GPU memory, I want to accumulate gradients across several mini-batches, average them, and only then apply the parameter update.
2. I need to shuffle the order of the mini-batches but not the samples inside each mini-batch, so that every video sequence keeps its frame order.
Question 1: You can run forward and backward on each mini-batch without calling `optimizer.update()`; the gradients accumulate across the `backward()` calls. After you have repeated forward and backward for the necessary number of mini-batches, call `optimizer.update()` once to update the parameters based on the accumulated gradients.
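The idea can be sketched independently of any framework. Here is a minimal pure-Python illustration, where the scalar model `y_hat = w * x` and its squared-error loss are hypothetical stand-ins: each `grad(...)` call plays the role of one forward/backward pass, and the single averaged update at the end corresponds to the one `optimizer.update()` call.

```python
def grad(w, batch):
    """Average gradient of the squared-error loss (w*x - y)**2 over one mini-batch."""
    g = 0.0
    for x, y in batch:
        g += 2.0 * (w * x - y) * x
    return g / len(batch)

def accumulated_update(w, minibatches, lr=0.05):
    """Accumulate gradients over all mini-batches, then apply ONE update
    with the averaged gradient (analogous to calling the optimizer's
    update only after several forward/backward passes)."""
    acc = 0.0
    for batch in minibatches:
        acc += grad(w, batch)   # backward() would add into the stored grads here
    acc /= len(minibatches)     # average across mini-batches
    return w - lr * acc         # the single optimizer step

# Usage: two mini-batches drawn from y = 2x; w moves toward 2.0.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(200):
    w = accumulated_update(w, batches)
```

Note that averaging the accumulated gradient keeps the effective step size comparable to a single large batch, regardless of how many mini-batches you split it into.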
If you want to achieve this with the `trainer` module, I think you need to override `StandardUpdater` to define your own `Updater` class that does the above.

Question 2: Are you using the `trainer` module? If so, you can define your own iterator to achieve this. See below for a reference on how to define such an iterator class.
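As a sketch of the iterator idea (pure Python, independent of any framework; the function name and `batch_size` chunking are assumptions for illustration): cut the dataset into consecutive mini-batches first, then shuffle only the list of mini-batches, never the contents of each one.

```python
import random

def sequence_preserving_batches(dataset, batch_size, seed=None):
    """Return mini-batches whose internal sample order is untouched,
    while the ORDER OF THE MINI-BATCHES themselves is shuffled.

    In a trainer-style setup you would wrap this logic inside your own
    iterator class instead of letting the iterator shuffle every sample."""
    # 1. Cut the dataset into consecutive chunks (frame order preserved).
    batches = [dataset[i:i + batch_size]
               for i in range(0, len(dataset), batch_size)]
    # 2. Shuffle only the chunk order.
    rng = random.Random(seed)
    rng.shuffle(batches)
    return batches

# Usage: frames 0..7 in chunks of 3; each chunk stays internally ordered,
# only the chunk order changes from epoch to epoch.
frames = list(range(8))
batches = sequence_preserving_batches(frames, batch_size=3, seed=0)
```

Calling this once per epoch (with a different seed or no seed) gives a fresh mini-batch order each epoch while every video sequence inside a batch keeps its frame order.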