cachegrind counts do not reflect real performance

Question

Asked by Arek' Fu on 13 October 2012 at 13:51

Two versions of the same algorithm yield different total instruction fetch counts and cycle estimates under valgrind/cachegrind. The difference is about 25%. Process timing, however, is very similar (it is actually shorter for the version that cachegrind reports as slower):

version 1:

Ir:     146,328,018,245
CEst:   152,553,736,055
timing: 17.93 s

version 2:

Ir:     185,221,836,610
CEst:   197,531,381,950
timing: 17.53 s

Is this behaviour expected? How can I learn more about why version 1 is slower?
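
For reference, the relative differences between the two sets of figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the dictionary keys and the helper name `relative_change` are made up for illustration, and the numbers are copied from the listing above.

```python
# Figures copied from the two listings in the question.
v1 = {"Ir": 146_328_018_245, "CEst": 152_553_736_055, "timing_s": 17.93}
v2 = {"Ir": 185_221_836_610, "CEst": 197_531_381_950, "timing_s": 17.53}


def relative_change(a, b):
    """Percentage change going from a to b (hypothetical helper)."""
    return 100.0 * (b - a) / a


# Change from version 1 to version 2 for each metric.
changes = {key: relative_change(v1[key], v2[key]) for key in v1}

for key, pct in changes.items():
    print(f"{key}: {pct:+.1f}%")
```

Measured against version 1, the cachegrind counts of version 2 are roughly 27% (Ir) and 29% (CEst) higher, while its run time is about 2% lower, which is the mismatch the question is about.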

There is 1 answer

**Arek' Fu** · Accepted Answer · 2012-10-18T12:46:54+00:00

I discovered that the inconsistency is due to the different compiler options used for the cachegrind runs and for the timing runs. In particular, I had disabled function inlining for the cachegrind runs (so that I could get meaningful per-function counts).
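
One way to avoid this kind of mismatch is to build a single binary and use it for both the cachegrind run and the timing run. The sketch below only assembles the command lines so that the shared binary is explicit. It assumes a GCC-style compiler invoked as `g++`; the file names `algo.cpp` and `algo`, the flag list, and the helper `measurement_commands` are hypothetical and not taken from the original post.

```python
import shlex


def measurement_commands(source, binary, cflags):
    """Return (compile, cachegrind, timing) commands that share one binary.

    Hypothetical helper: the same compiler flags feed both measurements,
    so the profiled code and the timed code are identical.
    """
    compile_cmd = ["g++", *cflags, "-o", binary, source]
    cachegrind_cmd = ["valgrind", "--tool=cachegrind", f"./{binary}"]
    timing_cmd = [f"./{binary}"]  # run this under the timer of your choice
    return compile_cmd, cachegrind_cmd, timing_cmd


# Assumed flags for illustration; adding "-fno-inline" here would change
# the code being measured in BOTH runs, keeping them comparable.
flags = ["-O2", "-g"]

for cmd in measurement_commands("algo.cpp", "algo", flags):
    print(shlex.join(cmd))
```

If per-function counts require inlining to be disabled, the timing comparison should then be made with that same non-inlined build, not with a separately optimised one.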
