What does NSight Compute show for a stall reason that isn't "supported"?


The CUDA Profiling Guide lists various reasons for sampled warp stalls, e.g. Allocation, Barrier, LG Throttle, etc., and the NSight Compute profiler shows the distribution of these as part of the profiling results.

The thing is, some of the stall reasons are listed as being supported only from a certain compute capability onward, e.g. "LG Throttle: 7.0+".

My question: What happens on devices of earlier compute capability when a warp stalls for one of these "unsupported" reasons? In other words, what is the fallback for each of the listed stall reasons?

In Ye Olde NVIDIA Visual Profiler, we had an "Other" stall reason, see:

What are "Other" Issue Stall Reasons displayed by the Nsight profiler?

but I don't see this in (my version of) NSight Compute.


There are 2 answers

Abator Abetor On

When I profile a kernel with Nsight Compute 2019.5 on a Pascal GPU (sm_61), the LG Throttle and Sleeping metrics, which require compute capability 7.0+, do not show up as stall reasons.

[Image: warp states on Pascal]
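To make that observation concrete, here is a minimal Python sketch that filters stall reasons by a device's compute capability. The table `MIN_CC` and the function `visible_stall_reasons` are hypothetical names made up for illustration. Only the 7.0+ requirement for LG Throttle and Sleeping comes from this thread; a `None` entry just means no minimum was mentioned here. It models the reported behavior (unsupported reasons are simply absent) and says nothing about how Nsight Compute is implemented internally.

```python
# Hypothetical table: minimum compute capability (major, minor) per stall reason.
# Only the (7, 0) entries are taken from this thread; None means "no minimum
# mentioned in this thread", not "confirmed to work on every device".
MIN_CC = {
    "Allocation": None,
    "Barrier": None,
    "LG Throttle": (7, 0),
    "Sleeping": (7, 0),
}


def visible_stall_reasons(cc):
    """Return the stall reasons whose minimum is met by cc = (major, minor)."""
    return [
        name
        for name, min_cc in MIN_CC.items()
        if min_cc is None or cc >= min_cc  # tuple comparison: major, then minor
    ]


if __name__ == "__main__":
    print(visible_stall_reasons((6, 1)))  # Pascal, sm_61
    print(visible_stall_reasons((7, 0)))  # first capability with LG Throttle / Sleeping
```

Under these assumptions, a Pascal device (6, 1) gets a list without LG Throttle and Sleeping, which matches what the profile above shows.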

Robert Crovella On

Nsight Compute is neither supported on, nor the recommended profiler for, GPUs with a compute capability prior to 7.0.

There is no formal definition for the behavior of the tool in an unsupported setting. Consider it undefined behavior (UB).

Use a legacy profiler (nvvp, nvprof) for a GPU with a compute capability prior to 7.0.
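The rule of thumb above can be written as a small Python helper. This is only a sketch: `suggested_profiler` is a hypothetical name, and the 7.0 cutoff is taken from this answer rather than from any API. In practice the major and minor numbers would come from the device itself, for example from the `major` and `minor` fields of `cudaDeviceProp`.

```python
def suggested_profiler(major, minor=0):
    """Pick a profiler family from a compute capability, per the rule above.

    Hypothetical helper: the 7.0 threshold is the one stated in this answer.
    """
    if (major, minor) < (7, 0):
        # Pre-7.0 devices: legacy tools
        return "nvprof / nvvp"
    return "Nsight Compute"


if __name__ == "__main__":
    print(suggested_profiler(6, 1))  # Pascal, sm_61
    print(suggested_profiler(7, 0))  # first capability at the cutoff
```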