I am classifying video sequences and need two things:

1. Because of limited GPU memory, I want to accumulate gradients across several mini-batches, average them, and only then apply the parameter update.
2. I need to shuffle the order of the mini-batches but not the samples inside each mini-batch, so that every video sequence keeps its frame order.
Question 1: You can run forward and backward on each mini-batch without calling `optimizer.update()`; the gradients accumulate across the `backward()` calls. After you have repeated forward and backward for the necessary number of mini-batches, call `optimizer.update()` once to update the parameters based on the accumulated gradients.
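The idea can be sketched independently of any framework. Here is a minimal pure-Python illustration, where the scalar model `y_hat = w * x` and its squared-error loss are hypothetical stand-ins: each `grad(...)` call plays the role of one forward/backward pass, and the single averaged update at the end corresponds to the one `optimizer.update()` call.

```python
def grad(w, batch):
    """Average gradient of the squared-error loss (w*x - y)**2 over one mini-batch."""
    g = 0.0
    for x, y in batch:
        g += 2.0 * (w * x - y) * x
    return g / len(batch)

def accumulated_update(w, minibatches, lr=0.05):
    """Accumulate gradients over all mini-batches, then apply ONE update
    with the averaged gradient (analogous to calling the optimizer's
    update only after several forward/backward passes)."""
    acc = 0.0
    for batch in minibatches:
        acc += grad(w, batch)   # backward() would add into the stored grads here
    acc /= len(minibatches)     # average across mini-batches
    return w - lr * acc         # the single optimizer step

# Usage: two mini-batches drawn from y = 2x; w moves toward 2.0.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(200):
    w = accumulated_update(w, batches)
```

Note that averaging the accumulated gradient keeps the effective step size comparable to a single large batch, regardless of how many mini-batches you split it into.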
If you want to achieve this with the `trainer` module, I think you need to override `StandardUpdater` to define your own `Updater` class that does the above.

Question 2: Are you using the `trainer` module? If so, you can define your own iterator to achieve this. See below for a reference on how to define such an iterator class.
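As a sketch of the iterator idea (pure Python, independent of any framework; the function name and `batch_size` chunking are assumptions for illustration): cut the dataset into consecutive mini-batches first, then shuffle only the list of mini-batches, never the contents of each one.

```python
import random

def sequence_preserving_batches(dataset, batch_size, seed=None):
    """Return mini-batches whose internal sample order is untouched,
    while the ORDER OF THE MINI-BATCHES themselves is shuffled.

    In a trainer-style setup you would wrap this logic inside your own
    iterator class instead of letting the iterator shuffle every sample."""
    # 1. Cut the dataset into consecutive chunks (frame order preserved).
    batches = [dataset[i:i + batch_size]
               for i in range(0, len(dataset), batch_size)]
    # 2. Shuffle only the chunk order.
    rng = random.Random(seed)
    rng.shuffle(batches)
    return batches

# Usage: frames 0..7 in chunks of 3; each chunk stays internally ordered,
# only the chunk order changes from epoch to epoch.
frames = list(range(8))
batches = sequence_preserving_batches(frames, batch_size=3, seed=0)
```

Calling this once per epoch (with a different seed or no seed) gives a fresh mini-batch order each epoch while every video sequence inside a batch keeps its frame order.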