I am trying to train a neural network until the L2-norm of its gradient is within 10e-3 of 0; therefore, my code includes defining the parameters and gradients that are computed during the fit process. I keep hitting snags that make me think I am not getting at the parameters or gradient correctly.
Here is my code:
    def get_theta(self):
        theta = self.parameters().detach().cpu
        return theta

    def J_loss(self, xb, yb):
        #forward returns x so here it will return x on GPU
        #return cross_entropy result of xb and yb on GPU
        return F.cross_entropy(self.forward(xb.to(device)), yb.to(device))

    def fit(self, loader, epochs = 1999):
        norm2Gradient = 1
        while norm2Gradient >10e-3 and epochs <2000:
            #grad = []
            for _, batch in enumerate(loader):
                x, y = batch['x'], batch['y']
                #computes f.cross_entropy loss of (xb,yb) on GPU
                loss = self.J_loss(x,y)
                #print("loss:", loss)
                #computes new gradients
                grad = loss.backward()
                #print("grad:",grad)
                print("grad?",grad)
                #takes one step along new gradients to decrease the loss; updates parameters
                self.optimizer.step()
                #captures new parameters
                theta = self.parameters()
                print("theta:",theta)
                #collects gradient along new parameters
                for param in theta:
                    grad.append(param.grad)
                #computes gradient norm
                norm2Gradient = torch.linalg.norm(grad)
                sumNorm2Gradient += norm2Gradient.detach().cpu
                #clears out old gradients
                self.optimizer.zero_grad()
        return sumNorm2Gradient
The current error message, "AttributeError: 'NoneType' object has no attribute 'append'", occurs at the line:

    grad.append(param.grad)
Additionally, the print out of the variable "grad" is "None". I have combed through documentation trying to figure out what each line is doing in the code and how to extract the gradient and parameters. How do I correctly get at the gradient?
You defined grad the following way in your code:

    grad = loss.backward()

You are getting this error because torch.Tensor.backward returns precisely None.
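The gradients are instead stored in place on each parameter's .grad attribute once backward() has run, so they can be read from there. Here is a minimal sketch of that idea, not a drop-in fix for your class: the nn.Linear model, the SGD optimizer, the random batch, and the helper name grad_l2_norm are all assumptions made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_l2_norm(model):
    """L2 norm of all parameter gradients, treated as one flat vector.

    Hypothetical helper (not part of PyTorch). Parameters whose .grad is
    still None (no backward pass has reached them) are skipped.
    """
    grads = [p.grad.detach().flatten()
             for p in model.parameters() if p.grad is not None]
    return torch.linalg.norm(torch.cat(grads))


torch.manual_seed(0)
model = nn.Linear(4, 3)  # assumed toy stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 4)          # made-up batch of inputs
y = torch.randint(0, 3, (8,))  # made-up class labels

optimizer.zero_grad()                 # clear gradients from any previous step
loss = F.cross_entropy(model(x), y)
result = loss.backward()              # fills p.grad in place; result is None
norm = grad_l2_norm(model)            # read the gradients before they are cleared
optimizer.step()                      # update the parameters

# parameters() is a generator of tensors, so detach and flatten each one
# individually rather than calling .detach() on the generator itself.
theta = torch.cat([p.detach().flatten().cpu() for p in model.parameters()])
print("backward() returned:", result)
print("gradient norm:", norm.item())
print("number of parameters:", theta.numel())
```

In your fit loop, this would mean calling loss.backward() without assigning its result, computing the norm from the .grad attributes, and only then calling zero_grad(). Note also that .cpu is a method, so it needs parentheses: .cpu().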