How to show the title for nsys profile?

I have noticed that when I run nsys on my machine with

nsys profile --stats=true -o output-report ./input

It outputs the data like this:

NVIDIA Nsight Systems version 2022.4.2.50-32196742v0


[5/8] Executing 'cudaapisum' stats report

Time (%)  Total Time (ns)  Num Calls    Avg (ns)      Med (ns)     Min (ns)    Max (ns)    StdDev (ns)            Name         
 --------  ---------------  ---------  ------------  ------------  ----------  -----------  ------------  ----------------------
     46.7      100,404,793          3  33,468,264.3      22,463.0      12,434  100,369,896  57,938,512.8  cudaMallocManaged     
     39.5       84,938,847          1  84,938,847.0  84,938,847.0  84,938,847   84,938,847           0.0  cudaDeviceSynchronize 
     13.8       29,677,781          3   9,892,593.7   9,610,457.0   9,514,092   10,553,232     574,154.9  cudaFree              
      0.0           82,478          1      82,478.0      82,478.0      82,478       82,478           0.0  cuLibraryLoadData     
      0.0           40,588          1      40,588.0      40,588.0      40,588       40,588           0.0  cudaLaunchKernel      
      0.0              892          1         892.0         892.0         892          892           0.0  cuModuleGetLoadingMode

The section is described by "Executing 'cudaapisum' stats report" instead of the normal title like "CUDA API Statistics". So I'm wondering if there's a flag that I can use to output the stats like the one below:

The output below isn't from my machine; it's from an AWS machine.

NVIDIA Nsight Systems version 2021.1.1.66-6c5c5cb


CUDA API Statistics:

 Time(%)  Total Time (ns)  Num Calls    Average     Minimum    Maximum           Name         
 -------  ---------------  ---------  -----------  ---------  ---------  ---------------------
    61.5        250696605          3   83565535.0      36197  250541972  cudaMallocManaged    
    32.8        133916228          1  133916228.0  133916228  133916228  cudaDeviceSynchronize
     5.7         23226526          3    7742175.3    6373371    9064987  cudaFree             
     0.0            56395          1      56395.0      56395      56395  cudaLaunchKernel     

And the other thing I have to mention is that on my machine it automatically outputs the profile file to a .nsys-rep extension not the .qdrep extension. Are both of them the same or different?

I've been looking for information in the nsys documentation, but I couldn't find any. I've also searched Stack Overflow and NVIDIA's Nsight forum, but nothing has come up so far. Maybe I've missed something. Any help will be appreciated.

Note: both machines use the same command, just with a slightly different input file.

There is 1 answer

Zois Tasoulas (best answer)

And the other thing I have to mention is that on my machine it automatically outputs the profile file to a .nsys-rep extension not the .qdrep extension. Are both of them the same or different?

.nsys-rep is the new extension name for .qdrep files; the format itself is the same. The change happened with version 2021.4.

Specifically, from the release notes of the aforementioned version:

Result file rename

  • In order to make the Nsight tools family more consistent, all versions of Nsight Systems starting with 2021.4 will use the “.nsys-rep” extension for generated report files by default.

  • Older versions of Nsight Systems used “.qdrep”.

  • Nsight Systems GUI 2021.4 and higher will continue to support opening older “.qdrep” reports.

  • Versions of Nsight Systems GUI older than 2021.4 will not be able to open “.nsys-rep” reports.

Please note that the versions of the tool on your local machine and the AWS machine are different.

So I'm wondering if there's a flag that I can use to output the stats like the one below

There isn't a flag to control the output you are mentioning. Instead, you could modify your workflow slightly: profile your application without the --stats CLI switch so that only the report file (.nsys-rep or .qdrep) is collected, then run the nsys stats command on that file and select the specific stats reports you want.
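As a rough sketch of that two-step workflow, using the report name and application from the question (output-report and ./input) and the 'cudaapisum' report name that nsys 2022.4 printed there. Report names can differ between nsys versions, and this is not guaranteed to reproduce the older "CUDA API Statistics:" heading. The script only prints the commands if nsys is not on the PATH:

```shell
#!/bin/sh
# Names taken from the question; adjust for your own setup.
REPORT=output-report
APP=./input

# Step 1: profile WITHOUT --stats, so only the report file is collected.
PROFILE_CMD="nsys profile -o $REPORT $APP"

# Step 2: apply one specific stats report to the collected file.
# 'cudaapisum' is the report name shown by nsys 2022.4 in the question;
# older versions write a .qdrep file instead of .nsys-rep.
STATS_CMD="nsys stats --report cudaapisum $REPORT.nsys-rep"

if command -v nsys >/dev/null 2>&1; then
    # Run the stats step only if profiling succeeded.
    $PROFILE_CMD && $STATS_CMD
else
    # nsys is not installed here: just show what would be run.
    echo "$PROFILE_CMD"
    echo "$STATS_CMD"
fi
```

Keeping the two steps separate also means the same report file can be re-analyzed with other stats reports later without profiling the application again.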

If you have feature requests for the Nsight Systems tool, please let us know through the NVIDIA Developer Forum.