In the following,
x_6 = torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=-3)
The sizes of the tensors x_1, x_2_1, x_3_1 and x_5_1 are
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7]) respectively.
The size of x_6 turns out to be torch.Size([1, 1024, 7, 7])
I couldn't understand and visualise this concatenation along a negative dimension (-3 in this case). What exactly is happening here? What would happen if dim = 3 instead? Is there any constraint on dim for a given set of tensors?
The answer by danin is not completely correct; in fact, from the perspective of tensor algebra it is wrong, since it suggests the problem has to do with accessing or indexing a Python list. It doesn't.
The -3 means that we concatenate the tensors along the second dimension (you could have just as well used 1 instead of the confusing -3). Taking a closer look at the tensor shapes, it seems that they represent (b, c, h, w), where b stands for batch_size, c stands for number of channels, h stands for height, and w stands for width. This is usually the case somewhere at the final stages of encoding (possibly) images in a deep neural network, where we arrive at these feature maps.
The torch.cat() operation with dim=-3 therefore says that we concatenate these 4 tensors along the dimension of channels c (see above): 4 * 256 => 1024. Hence, the resultant tensor ends up with the shape torch.Size([1, 1024, 7, 7]).
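As a quick check, here is a minimal sketch (random tensors standing in for the four feature maps) that also answers the dim = 3 case:

import torch

xs = [torch.randn(1, 256, 7, 7) for _ in range(4)]

x_6 = torch.cat(xs, dim=-3)   # same as dim=1: concatenate along channels
print(x_6.shape)              # torch.Size([1, 1024, 7, 7])

x_w = torch.cat(xs, dim=3)    # same as dim=-1: concatenate along width
print(x_w.shape)              # torch.Size([1, 256, 7, 28])

As for the constraint: all tensors must have the same shape in every dimension except the one given by dim, and dim itself must lie in the range [-ndim, ndim - 1].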
Notes: It is hard to visualize a 4-dimensional space since we humans live in an inherently 3D world. Nevertheless, here are some answers that I wrote a while ago which will help to get some mental picture:
tensor in TensorFlow?