Nsight Compute Range Replay mode usage

40 views Asked by 최보열 At 09 March 2024 at 14:01

I want to measure the performance (IPC, throughput, duration) of an application that contains several kernels and CUDA API calls. How can I measure the "real" duration of such an application using Nsight Compute or Nsight Systems?

As I understand it, the Nsight Compute profiler is meant for profiling individual kernels, not whole applications.
The NVIDIA documentation also says that Nsight Compute serializes all kernels so that each one can be profiled accurately.

However, I saw that NCU supports the 'range replay' and 'application range replay' modes, which do not serialize kernels and can collect metrics for concurrent kernels. Do these modes also profile CUDA API activity such as cudaMemcpy()?

I just want to measure the overall performance and utilization of an application.

By overall performance I do not mean a single-kernel or average per-kernel figure such as gpc__cycles_elapsed.max. That value only describes the duration of one kernel, measured while kernels are serialized. I want the real duration of the application when it also runs asynchronous kernels and CUDA API calls.
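To make the difference concrete, here is a minimal Python sketch of why adding up serialized per-kernel durations does not give the real elapsed time once activities overlap. The timestamps and the function names (`serialized_duration`, `busy_duration`, `wall_clock_span`) are made up for illustration; they are not output from, or part of, Nsight Compute or Nsight Systems.

```python
from typing import List, Tuple

# (start_us, end_us) of one GPU activity on a shared timeline.
Interval = Tuple[float, float]


def serialized_duration(intervals: List[Interval]) -> float:
    """Sum of the individual durations, as if every activity ran back to back."""
    return sum(end - start for start, end in intervals)


def busy_duration(intervals: List[Interval]) -> float:
    """Length of the union of all intervals: time with at least one activity running."""
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged interval and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def wall_clock_span(intervals: List[Interval]) -> float:
    """Time from the first start to the last end, idle gaps included."""
    return max(end for _, end in intervals) - min(start for start, _ in intervals)


# Hypothetical timeline in microseconds (invented numbers, not measured data).
activities: List[Interval] = [
    (0.0, 40.0),     # async host-to-device copy
    (10.0, 70.0),    # kernel A on stream 1
    (30.0, 90.0),    # kernel B on stream 2, overlapping kernel A
    (100.0, 120.0),  # async device-to-host copy
]

print("serialized sum :", serialized_duration(activities))
print("GPU busy time  :", busy_duration(activities))
print("wall-clock span:", wall_clock_span(activities))
```

With these invented numbers the serialized sum is larger than both the busy time and the wall-clock span, which is the gap I am asking about: I need a tool that reports the overlapped timeline, not the per-kernel sum.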


There are 0 answers
