I just started using Google performance tools (the google-perftools and libgoogle-perftools4 packages in Ubuntu). I swear I have been googling for around a day and haven't found an answer!
The problem is that I do not get the result for ALL of my functions with CPU profiling. This is my code:
#include "gperftools/profiler.h"
#include <iostream>
#include <math.h>
using namespace std;

void bar()
{
    int a,b,c,d,j,k;
    a=0;
    int z=0;
    b = 1000;
    while(z < b)
    {
        while (a < b)
        {
            d = sin(a);
            c = cos(a);
            j = tan(a);
            k = tan(a);
            k = d * c + j *k;
            a++;
        }
        a = 0;
        z++;
    }
}

void foo()
{
    cout << "hey " << endl;
}

int main()
{
    ProfilerStart("/home/mohammad/gperf/dump.txt");
    int a = 1000;
    while(a--){foo();}
    bar();
    ProfilerFlush();
    ProfilerStop();
}
Compiled as g++ test.cc -lprofiler -o a.out
This is how I run the code:
CPUPROFILE=dump.txt ./a.out
I've also tried this:
CPUPROFILE_FREQUENCY=10000 LD_PRELOAD=/usr/local/lib/libprofiler.so.0.3.0 CPUPROFILE=dump.txt ./a.out
And this is what I get from google-pprof --text a.out dump.txt:
Using local file ./a.out.
Using local file ./dump.txt.
Total: 22 samples
       8  36.4%  36.4%        8  36.4% 00d8cb04
       6  27.3%  63.6%        6  27.3% bar
       3  13.6%  77.3%        3  13.6% __cos (inline)
       2   9.1%  86.4%        2   9.1% 00d8cab4
       1   4.5%  90.9%        1   4.5% 00d8cab6
       1   4.5%  95.5%        1   4.5% 00d8cb06
       1   4.5% 100.0%        1   4.5% __write_nocancel
       0   0.0% 100.0%        3  13.6% __cos
But there is no information about the foo function!
My system information: Ubuntu 12.04, g++ 4.6.3.
That's all!
TL;DR:
foo is too fast and small to get profiling events; run it 100 times more. The frequency setting had a typo, and pprof will not sample more often than CONFIG_HZ (usually 250). It is better to switch to the more modern Linux perf profiler (tutorial from its authors, Wikipedia).
Long version:
Your foo function is just too short and simple: it only calls two functions. This is visible in the assembly I got by compiling the test with g++ test.cc -lprofiler -o test.s -S -g and filtering test.s through the c++filt program to make the C++ names readable.
So, to see foo in the profile you should run it many more times, changing int a = 1000; in main to something much greater, like 10000 or better 100000 (as I did for the test).
Also, you may fix the incorrect "CPUPROFILE_FREQUENC=10000" to the correct CPUPROFILE_FREQUENCY (note the Y). I should say that 10000 is too high a setting for CPUPROFILE_FREQUENCY, because the profiler can usually generate only 1000 or 250 events per second, depending on the kernel configuration option CONFIG_HZ (most 3.x kernels have 250; check with grep CONFIG_HZ= /boot/config*). The default setting for CPUPROFILE_FREQUENCY in pprof is 100.
I tested different values of CPUPROFILE_FREQUENCY (100000, 10000, 1000, 250) with a bash script on Ubuntu 14.04. The results were the same in every case: 120-140 events, with the runtime of every ./test around 0.5 seconds. So the CPU profiler from google-perftools can't collect more events per second for a single thread than the CONFIG_HZ set in the kernel (mine has 250).
With the original a=1000, foo and cout's functions run too fast to receive a profiling event in every run (even at 250 events/s), so usually you see neither foo nor any input/output functions. In a small number of runs, __write_nocancel may get a sampling event, and then foo and the I/O functions from libstdc++ will be reported (somewhere not at the very top, so use the --text option of pprof or google-pprof) with a zero self event count and a non-zero child event count.
With a=100000, foo is still too short and fast to get its own events, but the I/O functions got several; I grepped them out of the long --text output.
Functions with zero own counters are seen only thanks to pprof's capability of reading call chains (it knows who calls the functions that got a sample, if frame info is not omitted).
I can also recommend a more modern, more capable (both software and hardware events, up to 5 kHz frequency or more; both user-space and kernel profiling) and better supported profiler: the Linux perf profiler (tutorial from its authors, Wikipedia).
I also collected results from perf with a=10000. To see a text report from the perf.data output file I pipe the report through less (because perf report by default starts an interactive profile browser); perf report -n | less shows the raw event (sample) counters.