When I list nvprof's metrics with
nvprof --query-events
I see:
thread_inst_executed: Number of instructions executed by the active threads. For each instruction it increments by number of threads, including predicated-off threads, that execute the instruction. It does not include replays.
I would like to use this metric, so I collect metrics using:
nvprof --csv --metrics thread_inst_executed,inst_executed,inst_executed_global_loads,inst_executed_global_stores,inst_executed_local_loads,inst_executed_local_stores,inst_executed_shared_loads,inst_executed_shared_stores,gld_transactions,gst_transactions,local_load_transactions,local_store_transactions,shared_load_transactions,shared_store_transactions,l2_read_transactions,l2_write_transactions,dram_read_transactions,dram_write_transactions,sysmem_read_transactions,sysmem_write_transactions ./my_program my arguments
The output has every metric I asked for... except thread_inst_executed. Why is it missing? How can I get it?
That isn't consistent usage: you listed events with --query-events, but then requested measurements with --metrics.
With nvprof (or nvvp), events and metrics are not the same thing.
To query events, you would use:
nvprof --query-events
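To query metrics, you would use:
nvprof --query-metrics
To profile, asking for an event measurement, you would use:
nvprof --events thread_inst_executed ./my_program my arguments
To profile, asking for a metric measurement, you would use:
nvprof --metrics inst_executed ./my_program my arguments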
If you mix the two up, for example by passing an event name to --metrics as in your command, I don't know what the behavior is, but I would not expect it to work properly.
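Applied to the command in the question, a minimal sketch might look like the following. It assumes that thread_inst_executed is exposed as an event on your GPU and that the remaining names are valid metrics there; confirm both with --query-events and --query-metrics. The metric list is shortened for illustration, and the variable names are hypothetical.

```shell
#!/bin/sh
# Sketch: request the event and the metrics through their own options.
# The application command line (./my_program my arguments) is the one
# from the question; the metric list is shortened for illustration.
EVENTS="thread_inst_executed"
METRICS="inst_executed,gld_transactions,gst_transactions"

EVENT_CMD="nvprof --csv --events $EVENTS ./my_program my arguments"
METRIC_CMD="nvprof --csv --metrics $METRICS ./my_program my arguments"

# Print each command; run it only if nvprof and the program are present.
for cmd in "$EVENT_CMD" "$METRIC_CMD"; do
    echo "$cmd"
    if command -v nvprof >/dev/null 2>&1 && [ -x ./my_program ]; then
        $cmd
    fi
done
```

Keeping the event run and the metric run separate makes it obvious which option each name belongs to, at the cost of profiling the program twice.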