RuntimeError: Given input size: (64x1x1). Calculated output size: (64x0x0). Output size is too small


My model is:

def forward(self, x):
    x = self.first_bn(x)
    x = self.selu(x)

    # attention block 0: scale block0's feature maps by a learned per-channel weight
    x0 = self.block0(x)
    y0 = self.avgpool(x0).view(x0.size(0), -1)
    y0 = self.fc_attention0(y0)
    y0 = self.sig(y0).view(y0.size(0), y0.size(1), -1)
    y0 = y0.unsqueeze(-1)
    x = x0 * y0 + y0

    x = nn.MaxPool2d(2)(x)  # halves the spatial dimensions

    # attention block 2
    x2 = self.block2(x)
    y2 = self.avgpool(x2).view(x2.size(0), -1)
    y2 = self.fc_attention2(y2)
    y2 = self.sig(y2).view(y2.size(0), y2.size(1), -1)
    y2 = y2.unsqueeze(-1)
    x = x2 * y2 + y2

    x = nn.MaxPool2d(2)(x)  # halves the spatial dimensions

    # attention block 4
    x4 = self.block4(x)
    y4 = self.avgpool(x4).view(x4.size(0), -1)
    y4 = self.fc_attention4(y4)
    y4 = self.sig(y4).view(y4.size(0), y4.size(1), -1)
    y4 = y4.unsqueeze(-1)
    x = x4 * y4 + y4

    x = nn.MaxPool2d(2)(x)  # halves the spatial dimensions

    x = self.bn_before_gru(x)
    x = self.selu(x)
    x = x.squeeze(-2)        # remove the height dimension (size 1 after pooling)
    x = x.permute(0, 2, 1)   # (batch, time, channels) for the GRU
    self.gru.flatten_parameters()
    x, _ = self.gru(x)
    x = x[:, -1, :]          # keep only the last GRU time step
    x = self.fc1_gru(x)
    x = self.fc2_gru(x)

    return x

def _make_attention_fc(self, in_features, l_out_features):
    l_fc = []
    l_fc.append(nn.Linear(in_features=in_features, out_features=l_out_features))
    return nn.Sequential(*l_fc)

How can I solve this error: RuntimeError: Given input size: (64x1x1). Calculated output size: (64x0x0). Output size is too small?


1 Answer

Answered by Chih-Hao Liu:

The primary issue lies in your input size.

If you examine the SpecRNet architecture, you'll notice that it includes some MaxPool2d modules.
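
For instance, pooling a feature map that is already 1x1 with MaxPool2d(2) would produce a 0x0 output, which raises exactly the error from the title. Here is a minimal sketch of my own (not part of the original post) that reproduces it in isolation:

import torch
import torch.nn as nn

pool = nn.MaxPool2d(2)

print(pool(torch.randn(8, 64, 2, 2)).shape)   # 2x2 -> 1x1: torch.Size([8, 64, 1, 1])

try:
    pool(torch.randn(8, 64, 1, 1))            # 1x1 -> would be 0x0
except RuntimeError as e:
    print(e)                                  # "... Output size is too small"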

Let's consider an example where we input a tensor with the size (8, 1, 64, 64).

Here are the output shapes at each layer of SpecRNet:

INPUT:  torch.Size([8, 1, 64, 64])
first_bn(x):  torch.Size([8, 1, 64, 64])
selu(x):  torch.Size([8, 1, 64, 64])
block0(x):  torch.Size([8, 20, 32, 32]) ######
avgpool(x0).view(x0.size(0), -1):  torch.Size([8, 20])
fc_attention0(y0):  torch.Size([8, 20])
sig(y0).view(y0.size(0), y0.size(1), -1):  torch.Size([8, 20, 1])
unsqueeze(-1):  torch.Size([8, 20, 1, 1])
x0 * y0 + y0:  torch.Size([8, 20, 32, 32])
MaxPool2d(2)(x):  torch.Size([8, 20, 16, 16]) ######
block2(x):  torch.Size([8, 64, 8, 8]) ######
avgpool(x2).view(x2.size(0), -1):  torch.Size([8, 64])
fc_attention2(y2):  torch.Size([8, 64])
sig(y2).view(y2.size(0), y2.size(1), -1):  torch.Size([8, 64, 1])
unsqueeze(-1):  torch.Size([8, 64, 1, 1])
x2 * y2 + y2:  torch.Size([8, 64, 8, 8])
MaxPool2d(2)(x):  torch.Size([8, 64, 4, 4]) ######
block4(x):  torch.Size([8, 64, 2, 2]) ######
avgpool(x4).view(x4.size(0), -1):  torch.Size([8, 64])
fc_attention4(y4):  torch.Size([8, 64])
sig(y4).view(y4.size(0), y4.size(1), -1):  torch.Size([8, 64, 1])
unsqueeze(-1):  torch.Size([8, 64, 1, 1])
x4 * y4 + y4:  torch.Size([8, 64, 2, 2])
MaxPool2d(2)(x):  torch.Size([8, 64, 1, 1]) ######
bn_before_gru(x):  torch.Size([8, 64, 1, 1])
selu(x):  torch.Size([8, 64, 1, 1])
squeeze(-2):  torch.Size([8, 64, 1])
permute(0, 2, 1):  torch.Size([8, 1, 64])
gru(x):  torch.Size([8, 1, 128])
fc1_gru(x):  torch.Size([8, 128])
fc2_gru(x):  torch.Size([8, 1])
OUTPUT:  torch.Size([8, 1])
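
(As a side note, a shape trace like the one above can be collected with forward hooks. The helper below is my own sketch, not code from SpecRNet, and it assumes you can instantiate the model yourself.)

import torch

def trace_shapes(model, x):
    """Print the output shape of every leaf module during one forward pass."""
    hooks = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            hooks.append(module.register_forward_hook(
                lambda mod, inp, out, name=name:
                    print(name, out.shape if torch.is_tensor(out) else type(out))))
    try:
        model(x)
    finally:
        for h in hooks:
            h.remove()

# Example (model construction depends on your repository):
# trace_shapes(model, torch.randn(8, 1, 64, 64))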

We observe that the spatial dimensions are halved after each of block0, block2, and block4, and again after each MaxPool2d.

Since SpecRNet applies block0, block2, block4 and three MaxPool2d layers, the spatial size is halved six times in total, so your spatial input size should be at least 2^6 = 64; anything smaller collapses to 0x0 somewhere along the way, which is exactly what the error reports.
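
As a quick sanity check (my own sketch, assuming each of the six stages simply halves the spatial size with integer division), you can see why 64 works and a smaller input does not:

def spatial_after_specrnet(size: int, halvings: int = 6) -> int:
    """Return the spatial size left after the halving stages of SpecRNet."""
    for _ in range(halvings):
        if size < 2:
            raise ValueError(f"feature map would collapse to 0x0 (size {size} before pooling)")
        size //= 2
    return size

print(spatial_after_specrnet(64))   # 1 -> the forward pass goes through
# spatial_after_specrnet(32)        # raises: the last MaxPool2d would output 0x0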

Moreover, since you define your model architecture in config.py as

from typing import Dict

def get_specrnet_config(input_channels: int) -> Dict:
    return {
        "filts": [input_channels, [input_channels, 20], [20, 64], [64, 64]],
        "nb_fc_node": 64,
        "gru_node": 64,
        "nb_gru_layer": 2,
        "nb_classes": 1,
    }

specrnet_config = get_specrnet_config(input_channels=1)

This means that your input has 1 channel.

In summary, your input size should be (batch_size, 1, 64, 64).
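
If your spectrograms happen to be smaller than 64x64, one possible workaround (my own suggestion, not part of the original answer) is to resize them before feeding the model, for example with bilinear interpolation:

import torch
import torch.nn.functional as F

spec = torch.randn(8, 1, 40, 50)   # a batch whose spatial size is too small for SpecRNet
spec = F.interpolate(spec, size=(64, 64), mode="bilinear", align_corners=False)
print(spec.shape)                  # torch.Size([8, 1, 64, 64]) -- a safe input size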