Slurm not optimally allocating multiple GPUs


We are using Slurm 20.02 with NVML autodetect. On some 8-GPU nodes with NVLink, Slurm allocates 4-GPU jobs in a surprising way that appears sub-optimal.

On a system with 8 Nvidia A40 GPUs, 4 NVLink bridges, and two AMD EPYC 7302 CPUs, we have the following topology:

$ nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    GPU4    GPU5    GPU6    GPU7    CPU Affinity    NUMA Affinity
GPU0     X      NV4     SYS     SYS     SYS     SYS     SYS     SYS     12-15,44-47     3
GPU1    NV4      X      SYS     SYS     SYS     SYS     SYS     SYS     8-11,40-43      2
GPU2    SYS     SYS      X      NV4     SYS     SYS     SYS     SYS     4-7,36-39       1
GPU3    SYS     SYS     NV4      X      SYS     SYS     SYS     SYS     0-3,32-35       0
GPU4    SYS     SYS     SYS     SYS      X      NV4     SYS     SYS     28-31,60-63     7
GPU5    SYS     SYS     SYS     SYS     NV4      X      SYS     SYS     24-27,56-59     6
GPU6    SYS     SYS     SYS     SYS     SYS     SYS      X      NV4     20-23,52-55     5
GPU7    SYS     SYS     SYS     SYS     SYS     SYS     NV4      X      16-19,48-51     4

Legend:
  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NV#  = Connection traversing a bonded set of # NVLinks

We see Slurm allocate 4-GPU jobs in groups such as [0,1,2,4], [1,2,3,7], and [0,4,5,6], each containing one pair of NVLinked GPUs and two unlinked GPUs. (These are nvidia-smi indices, not minor numbers, i.e., not the values in the NUMA Affinity column of the table above.)
We expected groups such as [0,1,2,3] or [0,1,4,5], each of which contains two pairs of NVLinked GPUs.
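
To make the comparison concrete, here is a minimal Python sketch that counts how many NVLink-bridged pairs fall entirely inside a given allocation. The pair table is read off the NV4 entries in the topology matrix above. The names `NVLINK_PAIRS` and `nvlinked_pairs` are illustrative only and are not part of Slurm or NVML.

```python
from itertools import combinations

# NVLink-bridged pairs, taken from the NV4 entries in the
# `nvidia-smi topo -m` matrix above (nvidia-smi indices).
NVLINK_PAIRS = {frozenset(p) for p in [(0, 1), (2, 3), (4, 5), (6, 7)]}


def nvlinked_pairs(allocation):
    """Return the NVLink-bridged pairs fully contained in an allocation."""
    return [pair for pair in combinations(sorted(allocation), 2)
            if frozenset(pair) in NVLINK_PAIRS]


# Allocations we observed from Slurm, and the kind we expected instead.
observed = [[0, 1, 2, 4], [1, 2, 3, 7], [0, 4, 5, 6]]
expected = [[0, 1, 2, 3], [0, 1, 4, 5]]

for gpus in observed + expected:
    pairs = nvlinked_pairs(gpus)
    print(gpus, "->", len(pairs), "NVLinked pair(s):", pairs)
```

Each observed allocation contains only one bridged pair, whereas the expected ones contain two.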

Some potentially relevant specs/settings:

# NVIDIA:
Driver Version: 460.32.03
CUDA Toolkit Version: 11.1

# slurm.conf:
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
AccountingStorageTRES=gres/gpu
JobAcctGatherType=jobacct_gather/linux

Questions:

  • Is this behavior expected?
  • Is there a way to force Slurm to allocate multiple pairs of NVLinked GPUs?