Why do different GPUs use different amounts of memory for the same process?

I have two GPUs on different computers: an NVIDIA A100 in a server and an NVIDIA Quadro RTX 3000 in my laptop. I monitored both machines with nvidia-smi and noticed that the two GPUs use different amounts of memory when running the exact same process (same code, same data, same CUDA version, same PyTorch version, same drivers). I created a dummy script to verify this:

import torch

device = torch.device("cuda:0")
# Python's built-in float maps to torch.float64, so this tensor is
# 10000 * 10000 * 8 bytes = 800,000,000 bytes (~763 MiB).
a = torch.ones((10000, 10000), dtype=float).to(device)

In nvidia-smi I can see how much memory this specific Python script uses:

  • A100: 1205 MiB
  • RTX 3000: 1651 MiB
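
For reference, the per-process figure that nvidia-smi prints can also be read programmatically through NVML. A minimal sketch, assuming the nvidia-ml-py bindings (imported as pynvml) are installed; the function and field names below come from those bindings, not from my original script:

import os
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
# List compute processes on GPU 0; usedGpuMemory is the driver-side,
# per-process figure nvidia-smi shows (it may be None on some setups).
for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
    if proc.pid == os.getpid() and proc.usedGpuMemory is not None:
        print(f"this process uses {proc.usedGpuMemory / 2**20:.0f} MiB")
pynvml.nvmlShutdown()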

However, when I query PyTorch itself for memory usage, I get the same values on both GPUs:

# PyTorch allocator statistics for device 0, in bytes
reserved = torch.cuda.memory_reserved(0)
allocated = torch.cuda.memory_allocated(0)

Both systems report the same usage:

  • reserved = 801112064 bytes (764 MiB)
  • allocated = 800000000 bytes (763 MiB)

Note that the allocated amount exactly matches the tensor: 10000 × 10000 = 1e8 float64 values at 8 bytes each is 800,000,000 bytes, i.e. 763 MiB. But this is much less than what nvidia-smi shows, and the size of the gap differs between the two machines.
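
To see the two accounting levels side by side, here is a small diagnostic sketch. It assumes no other process is using the GPU, since torch.cuda.mem_get_info reports device-wide totals rather than per-process usage:

import torch

device = torch.device("cuda:0")
a = torch.ones((10000, 10000), dtype=float).to(device)

# Allocator view: bytes PyTorch has handed out / cached for this process.
allocated = torch.cuda.memory_allocated(device)
reserved = torch.cuda.memory_reserved(device)

# Driver view: device-wide free/total memory, as returned by cudaMemGetInfo.
free, total = torch.cuda.mem_get_info(device)
used = total - free  # includes the CUDA context, not just the tensor

print(f"allocated {allocated / 2**20:.0f} MiB, reserved {reserved / 2**20:.0f} MiB")
print(f"driver-visible usage {used / 2**20:.0f} MiB")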

Why does nvidia-smi report different memory usage on these two systems?
