Custom model in PyTorch: trainable & non-trainable parameters

I want to create a custom model in PyTorch, where I need to multiply the inputs by a matrix containing both trainable and non-trainable parameters (I'm looking to implement a trainable Kalman filter, with free and fixed parameters). Furthermore, the matrix has the same parameter in more than one entry.

However, I am struggling (maybe too much!) to train this... any workaround?

class CustomModel(torch.nn.Module):
    def __init__(self,w0):
        super(CustomModel, self).__init__()
        self.w = torch.nn.Parameter(data = torch.tensor([w0], dtype=torch.float32, requires_grad=True))
    
        # self.matrix = torch.tensor(data=[[self.w, -1.], [-self.w, -1.]], dtype=torch.float32, requires_grad=True)  # this computes ∂COST/∂matrix --> BAD

        self.matrix_trainable = self.w*torch.tensor(data=[[0,1],[-1,0]], dtype=torch.float32,requires_grad=False)
        self.matrix = self.matrix_trainable - torch.eye(2)
    
    def forward(self, x):
        return self.matrix.matmul(x)

def loss(pred, y):
    return torch.mean((pred - y)**2)


my_model = CustomModel(w0=0.01)
optimizer = torch.optim.Adam(lr=0.01, params=my_model.parameters())

device = torch.device("cpu")
x = torch.ones(2).to(device)
y = torch.tensor(data=[2.,0.], dtype=torch.float32).to(device)


for k in range(10):

    optimizer.zero_grad()
    my_model.zero_grad()
    pred = my_model(x)
    cost = loss(pred,y)
    cost.backward()
    optimizer.step()

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

Torch version 2.0.1

Thanks a lot!

Matías

1 Answer

dungxibo123:

In your code, I can see that your purpose is to separate the matrix multiplication into two parts: a trainable A and a non-trainable B. The error itself happens because self.matrix is built once in __init__, so the autograd graph connecting w to matrix exists only once; the first call to backward() frees it, and the next iteration tries to backward through the already-freed graph. I suggest the following implementation:

class CustomModel(torch.nn.Module):
    def __init__(self, w0):
        super(CustomModel, self).__init__()
        # trainable part: requires_grad is implied by nn.Parameter
        self.A = torch.nn.Parameter(torch.tensor([w0], dtype=torch.float32))

        # non-trainable part; use any fixed values you want
        B = torch.tensor([[1., 0.], [0., 1.]])
        self.register_buffer("B", B, persistent=False)

    def forward(self, x):
        # combine the trainable and fixed parts here, inside forward,
        # so a fresh autograd graph is built on every call
        return (self.A * self.B) @ x  # or whatever combination you need
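
For example (my own quick check, not part of the original answer, reusing the loss, data, and training loop from the question):

my_model = CustomModel(w0=0.01)
optimizer = torch.optim.Adam(lr=0.01, params=my_model.parameters())

x = torch.ones(2)
y = torch.tensor([2., 0.], dtype=torch.float32)

for k in range(10):
    optimizer.zero_grad()
    pred = my_model(x)
    cost = loss(pred, y)
    cost.backward()   # no "backward a second time" error: the graph is rebuilt each forward
    optimizer.step()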


register_buffer registers B as an attribute of the module that is not returned by model.parameters(), so B will not be touched by optimizer.step() during training (with persistent=False it is also left out of the state_dict). It still moves with the model on model.to(device).
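
Applied to the model from the question, a minimal sketch of the same idea (the class name is mine, and I assume you still want matrix = w*[[0, 1], [-1, 0]] - eye(2), as in your original code):

class KalmanLikeModel(torch.nn.Module):
    def __init__(self, w0):
        super(KalmanLikeModel, self).__init__()
        # single trainable scalar that appears in two entries of the matrix
        self.w = torch.nn.Parameter(torch.tensor([w0], dtype=torch.float32))
        # fixed sign pattern and fixed diagonal, excluded from parameters()
        self.register_buffer("B", torch.tensor([[0., 1.], [-1., 0.]]), persistent=False)
        self.register_buffer("I", torch.eye(2), persistent=False)

    def forward(self, x):
        # rebuilt on every call: each iteration gets its own autograd graph,
        # and the shared parameter w receives the summed gradient from both entries
        matrix = self.w * self.B - self.I
        return matrix @ x

Rebuilding the matrix in forward is cheap and also handles the repeated-parameter case naturally: autograd accumulates the gradient of w over every entry in which it appears.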