In the following,
x_6 = torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=-3)
Sizes of tensors x_1, x_2_1, x_3_1, x_5_1 are
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7]) respectively.
The size of x_6 turns out to be torch.Size([1, 1024, 7, 7])
I couldn't understand & visualise this concatenation along a negative dimension (-3 in this case). What exactly is happening here? How does the same go if dim = 3? Is there any constraint on dim for a given set of tensors?
The answer by danin is not completely correct, and is actually wrong when looked at from the perspective of tensor algebra, since it suggests the problem has to do with accessing or indexing a Python list. It doesn't.
The `-3` means that we concatenate the tensors along the 2nd dimension (you could've very well used `1` instead of the confusing `-3`; for 4-D tensors, negative dims count from the end, so `-3` resolves to `4 + (-3) = 1`).

Taking a closer look at the tensor shapes, they seem to represent `(b, c, h, w)`, where `b` stands for batch size, `c` for number of channels, `h` for height, and `w` for width. This is usually the case somewhere at the final stages of encoding (possibly) images in a deep neural network, where we arrive at such feature maps.
The `torch.cat()` operation with `dim=-3` means that we concatenate these 4 tensors along the dimension of channels `c` (see above): 4 * 256 => 1024.
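A minimal sketch reproducing the setup from the question (the tensor contents here are just zeros for illustration; only the shapes matter):

```python
import torch

# Four feature maps shaped (batch, channels, height, width), as in the question.
x_1, x_2_1, x_3_1, x_5_1 = (torch.zeros(1, 256, 7, 7) for _ in range(4))

# For a 4-D tensor, negative dims count from the end: -1 -> 3, -2 -> 2,
# -3 -> 1, -4 -> 0. So dim=-3 is the channel dimension, same as dim=1.
x_6 = torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=-3)
print(x_6.shape)  # torch.Size([1, 1024, 7, 7])

# dim=-3 and dim=1 produce identical results.
assert torch.equal(x_6, torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=1))
```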
Hence, the resultant tensor ends up with the shape `torch.Size([1, 1024, 7, 7])`.

Notes: It is hard to visualize a 4-dimensional space since we humans live in an inherently 3D world. Nevertheless, here are some answers that I wrote a while ago which will help to get some mental picture.
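To answer the remaining parts of the question, here is a hedged sketch of what `dim=3` would do and what the constraints on `dim` are (the variable names `a`, `b`, `c` are mine, not from the question):

```python
import torch

a = torch.zeros(1, 256, 7, 7)
b = torch.zeros(1, 256, 7, 7)

# dim=3 (equivalently dim=-1) concatenates along the width axis instead:
print(torch.cat((a, b), dim=3).shape)  # torch.Size([1, 256, 7, 14])

# Constraints: dim must lie in [-a.dim(), a.dim() - 1] (here [-4, 3]),
# and every dimension EXCEPT `dim` must match across all inputs.
c = torch.zeros(1, 256, 5, 7)  # height differs from a
try:
    torch.cat((a, c), dim=3)   # fails: heights 7 vs 5 don't match
except RuntimeError as e:
    print("mismatch:", e)

# Concatenating along dim=2 works, since only the heights differ:
print(torch.cat((a, c), dim=2).shape)  # torch.Size([1, 256, 12, 7])
```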