I am in the early stages of learning PyTorch for deep learning, and I have come across something I don't understand. I wrote a very simple script just to make sure I fully understand the broadcasting mechanism, but I am getting an error that I find confusing.
import torch
X = torch.tensor([[1,5,2,7],[8,2,5,3]])
Y = torch.tensor([[2,9],[11,4],[9,2],[22,7]])
print(X.shape, Y.shape)
outputs
>>> torch.Size([2, 4]) torch.Size([4, 2])
But when I try to execute a basic mathematical operation on these tensors, where I would expect the broadcasting mechanism to bring them to the same size, I get the following error.
print(X + Y)
outputs
RuntimeError Traceback (most recent call last)
<ipython-input-7-e4a642f73c42> in <cell line: 1>()
----> 1 X + Y
RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 1
All the explanations I have seen say that the matrices simply need to be compatible for matrix multiplication, which to my knowledge in this case they are.
X = 2x4, Y = 4x2
The number of columns of X matches the number of rows of Y, so I don't understand the error.
First of all, in PyTorch you need to use matmul() (or the @ operator) for matrix multiplication. (I assume you are talking about multiplication, even though your example uses +.)
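With your tensors, that would look like this (a minimal sketch reusing the X and Y from the question):

```python
import torch

X = torch.tensor([[1, 5, 2, 7],
                  [8, 2, 5, 3]])
Y = torch.tensor([[2, 9],
                  [11, 4],
                  [9, 2],
                  [22, 7]])

# Matrix multiplication: (2, 4) @ (4, 2) -> (2, 2)
Z = X.matmul(Y)   # equivalent to X @ Y
print(Z.shape)    # torch.Size([2, 2])
print(Z)          # tensor([[229,  82],
                  #         [149, 111]])
```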
Second, this has nothing to do with broadcasting. Broadcasting is when you have an operation that requires two tensors to be of compatible shapes (usually the same), and they are not, but one of them can be expanded to an equivalent shape so that they become compatible.
An example from the broadcasting documentation:
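The code from the documentation did not survive above; the standard example from the PyTorch broadcasting semantics page looks roughly like this (shapes as in those docs):

```python
import torch

x = torch.empty(5, 3, 4, 1)
y = torch.empty(   3, 1, 1)

# x and y are broadcastable: trailing dimensions are compared right to left,
# and each pair must be equal, or one of them must be 1, or one is missing.
print((x + y).shape)  # torch.Size([5, 3, 4, 1])
```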
Another example would be adding an additional outer dimension to X in your example, giving it shape (1, 2, 4). Elementwise addition with Y still fails, but matmul() works: Y is broadcast to a (1, 4, 2) tensor (by prepending the so-called batch dimension), producing a (1, 2, 2) result.
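A sketch of that batched case, again reusing X and Y from the question:

```python
import torch

X = torch.tensor([[1, 5, 2, 7],
                  [8, 2, 5, 3]])
Y = torch.tensor([[2, 9],
                  [11, 4],
                  [9, 2],
                  [22, 7]])

Xb = X.unsqueeze(0)   # add a leading batch dimension: (2, 4) -> (1, 2, 4)
print(Xb.shape)       # torch.Size([1, 2, 4])

# Batched matmul: the 2-D Y is treated as (1, 4, 2) by prepending a
# batch dimension, so the result has shape (1, 2, 2).
Z = Xb.matmul(Y)
print(Z.shape)        # torch.Size([1, 2, 2])
print(Z)              # tensor([[[229,  82],
                      #          [149, 111]]])
```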