I'm trying to implement XAI methods for a CNN, and for that I would like to access higher-order derivatives of the output w.r.t. each feature map produced by the convolutional layers. PyTorch provides a nice way to access and store the first-order gradients via hooks, and my understanding is that I should be able to use these to calculate higher-order derivatives, since I've retained the graph all the way through. Yet it gives me None results when executing the following code.
def forward(self, x):
    x.requires_grad_(True)
    x1 = self.inc(x)
    x2 = self.down1(x1)
    _ = x2.register_hook(lambda grad: self._save_gradients(grad, x2))  # gradient hook
    # ... one hook per layer of interest
def _save_gradients(self, grad, feat):
    self.gradients_list.append(grad.detach().cpu().numpy())
    second_order_grad = torch.autograd.grad(grad, feat,
                                            grad_outputs=torch.ones_like(grad),
                                            retain_graph=True, allow_unused=True)
    print(second_order_grad)
    self.gradients_2_list.append(second_order_grad[0].detach().cpu().numpy())
    third_order_grad = torch.autograd.grad(second_order_grad[0], feat,
                                           grad_outputs=torch.ones_like(second_order_grad[0]),
                                           retain_graph=True, allow_unused=True)
    print(third_order_grad)
    self.gradients_3_list.append(third_order_grad[0].detach().cpu().numpy())
Now, I've tried various applications of autograd, but they either don't seem to recognize the graph or they produce None outputs for the higher-order derivatives. What am I missing here?
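To illustrate the behaviour, here is a minimal, purely illustrative reproduction (a toy tensor stands in for the real feature maps), assuming the backward pass is started with a plain backward() call:

import torch

x = torch.randn(1, 3, requires_grad=True)
feat = torch.tanh(x)                           # stands in for a feature map
grads = []
feat.register_hook(lambda g: grads.append(g))  # gradient hook, as above

out = (feat ** 2).sum()
out.backward()                                 # plain backward(): create_graph defaults to False

g = grads[0]
print(g.requires_grad, g.grad_fn)              # False None -> the gradient carries no graph
# so there is nothing for a second torch.autograd.grad call to differentiate through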
It seems that using hooks this way disconnects the resulting gradients from the computational graph: unless the backward pass is run with create_graph=True, the gradient tensors handed to a hook carry no grad_fn, so there is nothing left to differentiate a second time. Modifying the code along these lines allowed me to keep the graph intact whilst accessing the gradients and higher-order derivatives of them.
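As a minimal sketch of one such modification: the feature maps of interest are stashed during forward() and the derivatives are computed afterwards with torch.autograd.grad(..., create_graph=True) rather than inside a backward hook. The module layout, the tanh/head layers, and the compute_derivatives helper below are illustrative assumptions that merely mirror the snippet above, not the original model.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # stand-ins for the real inc/down1 blocks of the original model
        self.inc = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.down1 = nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1)
        self.head = nn.Linear(16, 1)
        self.features = []            # feature maps of interest
        self.gradients_list = []
        self.gradients_2_list = []
        self.gradients_3_list = []

    def forward(self, x):
        self.features.clear()
        x1 = self.inc(x)
        x2 = self.down1(x1)
        self.features.append(x2)      # keep a handle on the feature map instead of a hook
        # tanh keeps the higher-order derivatives non-trivial in this toy head
        return self.head(torch.tanh(x2).mean(dim=(2, 3)))

    def compute_derivatives(self, output):
        for feat in self.features:
            # create_graph=True makes the gradient itself differentiable again
            (g1,) = torch.autograd.grad(output, feat,
                                        grad_outputs=torch.ones_like(output),
                                        create_graph=True)
            (g2,) = torch.autograd.grad(g1, feat,
                                        grad_outputs=torch.ones_like(g1),
                                        create_graph=True)
            (g3,) = torch.autograd.grad(g2, feat,
                                        grad_outputs=torch.ones_like(g2),
                                        retain_graph=True)  # retain in case more maps follow
            self.gradients_list.append(g1.detach().cpu().numpy())
            self.gradients_2_list.append(g2.detach().cpu().numpy())
            self.gradients_3_list.append(g3.detach().cpu().numpy())

model = Net()
out = model(torch.randn(2, 3, 32, 32))
model.compute_derivatives(out)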