How to accumulate gradients across mini-batches and then back-propagate in Chainer?


I am classifying video sequences, and I need two things:

  1. Because of limited GPU memory, I want to back-propagate each mini-batch but accumulate the gradients across several mini-batches, average them, and only then apply the parameter update.

  2. I need to shuffle the order of mini-batches, but not shuffle inside each mini-batch, because each video sequence must keep its frame order.

1 Answer

Answered by corochann (accepted):

Question 1: You can run forward and backward on each mini-batch without calling optimizer.update(). After you have repeated forward & backward for the necessary number of mini-batches, call optimizer.update() once to update the parameters based on the accumulated gradients. Note that backward() in Chainer adds new gradients to the existing ones, so you only need to clear the gradients at the start of each accumulation window.
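As a minimal sketch (not an official recipe): it assumes `model` is a chainer.Link that returns a loss and `batches` yields (x, t) mini-batches; `accum_steps` is a made-up name for the accumulation count. Dividing the loss by `accum_steps` makes the accumulated gradient an average rather than a sum.

```python
from chainer import optimizers

# Sketch only: `model` (a loss-returning chainer.Link) and `batches`
# (an iterable of (x, t) mini-batches) are assumed to exist.
accum_steps = 4  # hypothetical number of mini-batches per update

optimizer = optimizers.Adam()
optimizer.setup(model)

model.cleargrads()                    # zero the gradient buffers once
for i, (x, t) in enumerate(batches):
    loss = model(x, t) / accum_steps  # pre-divide so the sum is an average
    loss.backward()                   # backward() *adds* to existing grads
    if (i + 1) % accum_steps == 0:
        optimizer.update()            # step with the accumulated gradients
        model.cleargrads()            # reset for the next window
```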

If you want to achieve this with the trainer module, I think you need to override StandardUpdater and define your own Updater class that does the above, as in the sketch below.
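A rough illustration of such an Updater; `AccumGradUpdater` and `accum_steps` are made-up names, and the 'main' optimizer's target is assumed to be a loss-returning link. Note that with this design one trainer "iteration" consumes `accum_steps` mini-batches.

```python
from chainer import training

class AccumGradUpdater(training.StandardUpdater):
    """Hypothetical Updater that accumulates gradients over
    `accum_steps` mini-batches before a single parameter update."""

    def __init__(self, iterator, optimizer, accum_steps=4, **kwargs):
        super(AccumGradUpdater, self).__init__(iterator, optimizer, **kwargs)
        self.accum_steps = accum_steps

    def update_core(self):
        optimizer = self.get_optimizer('main')
        model = optimizer.target          # the loss-returning link
        model.cleargrads()
        for _ in range(self.accum_steps):
            batch = self.get_iterator('main').next()
            x, t = self.converter(batch, self.device)
            loss = model(x, t) / self.accum_steps  # average over the window
            loss.backward()                        # gradients accumulate
        optimizer.update()                         # one step per window
```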

Question 2: Are you using the trainer module? If so, you can define your own iterator to achieve this. The sketch below shows one way such an iterator class could be defined.
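A hand-written sketch, under the assumption that the data has already been grouped into mini-batches, one ordered video sequence per batch: the iterator shuffles which mini-batch comes next, but never reorders the frames inside a mini-batch. (Alternatively, if each dataset example is a whole sequence, the built-in SerialIterator with shuffle=True already shuffles between examples without touching their internal order.)

```python
import numpy as np
from chainer.dataset import Iterator

class SequenceBatchIterator(Iterator):
    """Hypothetical iterator over pre-built mini-batches (one ordered
    video sequence each). Shuffles which batch comes next, but never
    the frames inside a batch."""

    def __init__(self, batches):
        self.batches = batches            # list of ready-made mini-batches
        self.epoch = 0
        self.is_new_epoch = False
        self._order = np.random.permutation(len(batches))
        self._index = 0

    def __next__(self):
        batch = self.batches[self._order[self._index]]
        self._index += 1
        if self._index >= len(self.batches):
            # Epoch finished: reshuffle the batch order, keep batch contents.
            self.epoch += 1
            self.is_new_epoch = True
            self._order = np.random.permutation(len(self.batches))
            self._index = 0
        else:
            self.is_new_epoch = False
        return batch

    @property
    def epoch_detail(self):
        # Fractional epoch count, used by trainer stop conditions.
        return self.epoch + self._index / len(self.batches)
```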