I'm using the open-source pytorch-gradcam library with a custom-trained ViT model.
Everything seems to work fine, but the Grad-CAM result shows lines, as in this example:

[image: gradcam result]

Any ideas why?
FYI: I'm using an audio dataset that is converted to features with torchaudio Fbank.
I followed the ViT tutorial example. The data has one channel and is converted to three channels by applying a color map with cv2.
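
For reference, here is a minimal sketch of the kind of `reshape_transform` the ViT tutorial relies on, written so the token grid size is explicit instead of the tutorial's default of 14 x 14. It is not my exact code. The helper names `token_grid` and `make_reshape_transform` are hypothetical, and the sketch assumes non-overlapping square patches, a single class token at position 0, and row-major token order; a spectrogram-shaped input would generally give a non-square grid.

```python
import torch


def token_grid(img_height, img_width, patch_size=16):
    """Patch tokens along each axis, assuming non-overlapping square patches."""
    return img_height // patch_size, img_width // patch_size


def make_reshape_transform(grid_h, grid_w, num_prefix_tokens=1):
    """Build a reshape_transform for a ViT whose patch tokens form a grid_h x grid_w grid.

    num_prefix_tokens is the number of non-patch tokens (e.g. the class token)
    at the start of the sequence; 1 is an assumption that fits a plain ViT.
    """

    def reshape_transform(tensor):
        # tensor: (batch, num_prefix_tokens + grid_h * grid_w, channels)
        patches = tensor[:, num_prefix_tokens:, :]
        result = patches.reshape(tensor.size(0), grid_h, grid_w, tensor.size(2))
        # Move channels first, like a CNN feature map: (batch, channels, grid_h, grid_w)
        return result.permute(0, 3, 1, 2)

    return reshape_transform


# Example: a 128 x 1024 input (hypothetical size) with 16 x 16 patches.
grid_h, grid_w = token_grid(128, 1024)
reshape_transform = make_reshape_transform(grid_h, grid_w)

# The transform is then passed to the CAM object, e.g.:
# cam = GradCAM(model=model, target_layers=target_layers,
#               reshape_transform=reshape_transform)
```

If the grid passed here does not match the number of patch tokens the model actually produces, the reshape either fails or scrambles the spatial layout, so the grid size is one of the things I would like to rule out.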