I just started using Google performance tools (the google-perftools and libgoogle-perftools4 packages in Ubuntu). I have been googling for about a day and have not found an answer.
The problem is that CPU profiling does not report results for ALL of my functions. This is my code:
#include "gperftools/profiler.h"
#include <iostream>
#include <math.h>
using namespace std;

void bar()
{
    int a,b,c,d,j,k;
    a=0;
    int z=0;
    b = 1000;
    while(z < b)
    {
        while (a < b)
        {
            d = sin(a);
            c = cos(a);
            j = tan(a);
            k = tan(a);
            k = d * c + j *k;
            a++;
        }
        a = 0;
        z++;
    }
}

void foo()
{
    cout << "hey " << endl;
}

int main()
{
    ProfilerStart("/home/mohammad/gperf/dump.txt");
    int a = 1000;
    while(a--){foo();}
    bar();
    ProfilerFlush();
    ProfilerStop();
}
I compiled it with g++ test.cc -lprofiler -o a.out
This is how I run the code:
CPUPROFILE=dump.txt ./a.out
I've also tried this:
CPUPROFILE_FREQUENCY=10000 LD_PRELOAD=/usr/local/lib/libprofiler.so.0.3.0 CPUPROFILE=dump.txt ./a.out
And this is what I get from google-pprof --text a.out dump.txt:
Using local file ./a.out.
Using local file ./dump.txt.
Total: 22 samples
8 36.4% 36.4% 8 36.4% 00d8cb04
6 27.3% 63.6% 6 27.3% bar
3 13.6% 77.3% 3 13.6% __cos (inline)
2 9.1% 86.4% 2 9.1% 00d8cab4
1 4.5% 90.9% 1 4.5% 00d8cab6
1 4.5% 95.5% 1 4.5% 00d8cb06
1 4.5% 100.0% 1 4.5% __write_nocancel
0 0.0% 100.0% 3 13.6% __cos
But there is no information about the foo function!
My system information: Ubuntu 12.04, g++ 4.6.3.
That's all!
TL;DR: foo is too fast and too small to receive profiling events; run it 100 times more. The frequency setting had a typo, and pprof will not sample more often than CONFIG_HZ (usually 250). It is better to switch to the more modern Linux perf profiler (tutorial from its authors, Wikipedia).

Long version:
Your foo function is just too short and simple: it only calls two functions. I compiled the test with g++ test.cc -lprofiler -o test.s -S -g and filtered test.s through the c++filt program to make the C++ names readable:

So, to see foo in the profile you should run it many more times, changing int a = 1000; in main to something much greater, like 10000 or, better, 100000 (as I did for the test).

Also, you may fix the incorrect "CPUPROFILE_FREQUENC=10000" to the correct CPUPROFILE_FREQUENCY (note the Y). I should say that 10000 is too high a setting for CPUPROFILE_FREQUENCY, because the profiler can usually generate only 1000 or 250 events per second, depending on the kernel configuration option CONFIG_HZ (most 3.x kernels have 250; check with grep CONFIG_HZ= /boot/config*). The default setting for CPUPROFILE_FREQUENCY in pprof is 100.

I tested different values of CPUPROFILE_FREQUENCY (100000, 10000, 1000, 250) with a bash script on Ubuntu 14.04:

The results were the same each time: 120-140 events, with a runtime of around 0.5 seconds for every ./test run. So the CPU profiler from google-perftools cannot deliver more events per second for a single thread than the CONFIG_HZ set in the kernel (mine has 250).
With the original a=1000, foo and the cout functions run too fast to receive any profiling event (even at 250 events/s) in most runs, so you see neither foo nor any input/output functions. In a small fraction of runs, __write_nocancel may receive a sampling event, and then foo and the I/O functions from libstdc++ will be reported (somewhere below the very top, so use the --text option of pprof or google-pprof) with a zero self event count and a non-zero child event count:

With a=100000, foo is still too short and fast to get its own events, but the I/O functions got several. This is the list I grepped from the long --text output:

Functions with zero own counters are visible only thanks to pprof's ability to read call chains (it knows who called the function that got the sample, if frame information is not omitted).

I can also recommend a more modern, more capable (both software and hardware events, frequencies up to 5 kHz or more, both user-space and kernel profiling) and better supported profiler: the Linux perf profiler (tutorial from its authors, Wikipedia).

Here are the results from perf with a=10000:

To see a text report from the perf.data output file I'll use less (because perf report starts an interactive profile browser by default):

Or perf report -n | less to see raw event (sample) counters: