I want to create a custom model in PyTorch where I need to multiply inputs by a matrix containing both trainable and non-trainable parameters (I'm looking to implement a trainable Kalman filter, with free and fixed parameters). Furthermore, the same parameter appears in more than one entry of that matrix.
However, I am struggling (maybe too much!) to train this... any workaround?
import torch

class CustomModel(torch.nn.Module):
    def __init__(self, w0):
        super(CustomModel, self).__init__()
        self.w = torch.nn.Parameter(data=torch.tensor([w0], dtype=torch.float32, requires_grad=True))
        # self.matrix = torch.tensor([[self.w, -1.], [-self.w, -1.]], dtype=torch.float32, requires_grad=True)  # this computes \partial_matrix(COST) --> BAD
        self.matrix_trainable = self.w * torch.tensor(data=[[0, 1], [-1, 0]], dtype=torch.float32, requires_grad=False)
        self.matrix = self.matrix_trainable - torch.eye(2)

    def forward(self, x):
        return self.matrix.matmul(x)

def loss(pred, y):
    return torch.mean((pred - y) ** 2)

my_model = CustomModel(w0=0.01)
optimizer = torch.optim.Adam(lr=0.01, params=my_model.parameters())
device = torch.device("cpu")
x = torch.ones(2).to(device)
y = torch.tensor(data=[2., 0.], dtype=torch.float32).to(device)

for k in range(10):
    optimizer.zero_grad()
    my_model.zero_grad()
    pred = my_model(x)
    cost = loss(pred, y)
    cost.backward()
    optimizer.step()
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
Torch version 2.0.1
Thanks a lot!
Matías
In your code I can see that your intent is to separate the matrix multiplication into two parts: a trainable part A and a non-trainable part B. I suggest the following implementation.
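Here is a minimal sketch of that idea (one way to realize it, keeping your names w, x and y): the fixed tensors are stored with register_buffer and the matrix is assembled inside forward(), so autograd builds a fresh graph on every iteration instead of reusing the one created in __init__.

import torch

class CustomModel(torch.nn.Module):
    def __init__(self, w0):
        super().__init__()
        # trainable scalar: the only thing returned by model.parameters()
        self.w = torch.nn.Parameter(torch.tensor([w0], dtype=torch.float32))
        # fixed tensors: registered as buffers, so they follow .to(device) but are never optimized
        self.register_buffer("A", torch.tensor([[0., 1.], [-1., 0.]]))
        self.register_buffer("B", -torch.eye(2))

    def forward(self, x):
        # rebuild the matrix from w on every call instead of once in __init__
        matrix = self.w * self.A + self.B
        return matrix.matmul(x)

def loss(pred, y):
    return torch.mean((pred - y) ** 2)

my_model = CustomModel(w0=0.01)
optimizer = torch.optim.Adam(lr=0.01, params=my_model.parameters())

x = torch.ones(2)
y = torch.tensor([2., 0.], dtype=torch.float32)

for k in range(10):
    optimizer.zero_grad()
    pred = my_model(x)
    cost = loss(pred, y)
    cost.backward()
    optimizer.step()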
register_buffer makes B part of the module's state (it moves with .to(device) and is saved in the state_dict), but it is not returned by model.parameters(), so B will not be affected by optimizer.step() during training.
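As a quick check with the sketch above, only w is exposed to the optimizer, while the buffers remain part of the model state:

print([name for name, _ in my_model.named_parameters()])  # ['w']
print(list(my_model.state_dict().keys()))                 # ['w', 'A', 'B']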