I want to train the model in FP32 and run inference in FP16.
FP16 inference worked for other networks (e.g. ResNet),
but it does not work for EDSR (super resolution).
The differences I found between them are:
- ReLU with inplace=True in EDSR
- PixelShuffle in EDSR
- No batchnorm in EDSR
I am using CUDA 11.3, Python 3.8.12, PyTorch 1.12.1 and cuDNN 8.7.0. Are there any functions in convolutional neural networks that do not support FP16?
GPU: RTX A6000
My process is like:

net_half = net.half()
net_half.eval()
input_half = input.half()
with torch.no_grad():
    output_half = net_half(input_half)
I checked that there are no NaN values in the model parameters or in the input; both

torch.stack([torch.isnan(p).any() for p in net_half.parameters()]).any()
torch.isnan(input_half).any()

give False.
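To narrow down where non-finite values first appear during the forward pass, I also tried registering forward hooks. Below is a small debugging sketch (the helper name `find_nonfinite_modules` is my own, not part of EDSR or PyTorch); note that in FP16, anything above torch.finfo(torch.float16).max (about 65504) overflows to Inf, which later ops can turn into NaN:

```python
import torch
import torch.nn as nn

def find_nonfinite_modules(model, x):
    """Run one forward pass and return the names of submodules whose
    output contains Inf or NaN. Debugging helper sketch only."""
    offenders = []
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            # Record the module if any output element is Inf or NaN.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                offenders.append(name)
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            handles.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return offenders
```

Calling it as find_nonfinite_modules(net_half, input_half) lists the submodules (in execution order) whose outputs went non-finite; the first entry is where the overflow starts.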
And I checked the basic operations used in EDSR with a simple test block (Ny, Nx and device are defined earlier in my script):

import torch
import torch.nn as nn

x = torch.randn(1, 4, Ny // 2, Nx // 2)
test_block1 = nn.Sequential(
    nn.Conv2d(4, 64, kernel_size=3, padding=1),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=True),
    nn.ReLU(True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=True),
    nn.Conv2d(64, 64 * 4, kernel_size=3, padding=1, bias=True),
    nn.PixelShuffle(2),
    nn.ReLU(True),
    nn.Conv2d(64, 4, kernel_size=3, padding=1),
)
x = x.half().to(device)
test_block1 = test_block1.half().to(device)
with torch.no_grad():
    y = test_block1(x)
print(y)
This does not produce any NaN values either.
One thing I cannot explain: with the same FP16 inference code, the checkpoint from epoch 1 gave valid results, but the checkpoint from epoch 4 gave NaN values.
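As a workaround I am considering mixed precision instead of casting everything with .half(). The sketch below (a stand-in block, not the real EDSR) keeps the weights in FP32 and lets torch.autocast run each op in reduced precision only where it is considered safe, which I understand often avoids the overflow a full half cast can trigger in models without batchnorm:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# CUDA autocast uses float16; CPU autocast supports bfloat16.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

# Stand-in block only (not EDSR): conv -> ReLU -> conv -> PixelShuffle.
net = nn.Sequential(
    nn.Conv2d(4, 64, kernel_size=3, padding=1),
    nn.ReLU(True),
    nn.Conv2d(64, 4 * 4, kernel_size=3, padding=1),
    nn.PixelShuffle(2),
).to(device).eval()

x = torch.randn(1, 4, 8, 8, device=device)  # input stays FP32
with torch.no_grad(), torch.autocast(device_type=device, dtype=amp_dtype):
    y = net(x)

print(y.shape)  # torch.Size([1, 4, 16, 16]); y.dtype is the reduced type
```

Would autocast be the recommended way to do FP16 inference here, or should a full .half() cast also work for EDSR?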