I'm trying to implement XAI methods for a CNN, and for that I would like to access higher-order derivatives of the output w.r.t. each feature map produced by the convolutional layers. PyTorch provides a nice way to access and store the first-order gradients via hooks, and my understanding is that I should be able to use these to calculate higher-order derivatives, since I've retained the graph all the way through. Yet it gives me None results when executing the following code.
def forward(self, x):
    x.requires_grad_(True)
    x1 = self.inc(x)
    x2 = self.down1(x1)
    _ = x2.register_hook(lambda grad: self._save_gradients(grad, x2))  # gradient hook
    # ... one hook per layer of interest
def _save_gradients(self, grad, feat):
    self.gradients_list.append(grad.detach().cpu().numpy())
    second_order_grad = torch.autograd.grad(grad, feat,
                                            grad_outputs=torch.ones_like(grad),
                                            retain_graph=True, allow_unused=True)
    print(second_order_grad)
    self.gradients_2_list.append(second_order_grad[0].detach().cpu().numpy())
    third_order_grad = torch.autograd.grad(second_order_grad[0], feat,
                                           grad_outputs=torch.ones_like(second_order_grad[0]),
                                           retain_graph=True, allow_unused=True)
    print(third_order_grad)
    self.gradients_3_list.append(third_order_grad[0].detach().cpu().numpy())
Now, I've tried various applications of autograd, but they either don't seem to recognize the graph or they produce None outputs for the higher-order derivatives. What am I missing here?
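To illustrate the behaviour, here is a minimal, purely illustrative reproduction (a toy tensor stands in for the real feature maps), assuming the backward pass is started with a plain backward() call:

import torch

x = torch.randn(1, 3, requires_grad=True)
feat = torch.tanh(x)                           # stands in for a feature map
grads = []
feat.register_hook(lambda g: grads.append(g))  # gradient hook, as above

out = (feat ** 2).sum()
out.backward()                                 # plain backward(): create_graph defaults to False

g = grads[0]
print(g.requires_grad, g.grad_fn)              # False None -> the gradient carries no graph
# so there is nothing for a second torch.autograd.grad call to differentiate through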
It seems that using hooks this way disconnects the resulting gradients from the computational graph: unless the backward pass is run with create_graph=True, the gradient tensors handed to a hook carry no grad_fn, so there is nothing left to differentiate a second time. Modifying the code along these lines allowed me to keep the graph intact whilst accessing the gradients and higher-order derivatives of them.
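As a minimal sketch of one such modification: the feature maps of interest are stashed during forward() and the derivatives are computed afterwards with torch.autograd.grad(..., create_graph=True) rather than inside a backward hook. The module layout, the tanh/head layers, and the compute_derivatives helper below are illustrative assumptions that merely mirror the snippet above, not the original model.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # stand-ins for the real inc/down1 blocks of the original model
        self.inc = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.down1 = nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1)
        self.head = nn.Linear(16, 1)
        self.features = []            # feature maps of interest
        self.gradients_list = []
        self.gradients_2_list = []
        self.gradients_3_list = []

    def forward(self, x):
        self.features.clear()
        x1 = self.inc(x)
        x2 = self.down1(x1)
        self.features.append(x2)      # keep a handle on the feature map instead of a hook
        # tanh keeps the higher-order derivatives non-trivial in this toy head
        return self.head(torch.tanh(x2).mean(dim=(2, 3)))

    def compute_derivatives(self, output):
        for feat in self.features:
            # create_graph=True makes the gradient itself differentiable again
            (g1,) = torch.autograd.grad(output, feat,
                                        grad_outputs=torch.ones_like(output),
                                        create_graph=True)
            (g2,) = torch.autograd.grad(g1, feat,
                                        grad_outputs=torch.ones_like(g1),
                                        create_graph=True)
            (g3,) = torch.autograd.grad(g2, feat,
                                        grad_outputs=torch.ones_like(g2),
                                        retain_graph=True)  # retain in case more maps follow
            self.gradients_list.append(g1.detach().cpu().numpy())
            self.gradients_2_list.append(g2.detach().cpu().numpy())
            self.gradients_3_list.append(g3.detach().cpu().numpy())

model = Net()
out = model(torch.randn(2, 3, 32, 32))
model.compute_derivatives(out)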