Whenever I call:
std::chrono::high_resolution_clock::now().time_since_epoch().count();
The generated assembly is:
std::chrono::high_resolution_clock::now().time_since_epoch().count();
00007FF7D9E11840 call qword ptr [__imp__Query_perf_frequency (07FF7D9E14090h)]
00007FF7D9E11846 call qword ptr [__imp__Query_perf_counter (07FF7D9E140A0h)]
I've used the Windows API clock before, and I thought the right way was to query the frequency only once.
The Microsoft documentation says:
QueryPerformanceFrequency Retrieves the frequency of the performance counter. The frequency of the performance counter is fixed at system boot and is consistent across all processors. Therefore, the frequency need only be queried upon application initialization, and the result can be cached.
This was in a loop, so I think QueryPerformanceFrequency is being called on every iteration. This was built in Release mode with /O2 optimization.
Also, if I build in Debug mode, it produces the following assembly:
std::chrono::high_resolution_clock::now().time_since_epoch().count();
00007FF774FC9D19 lea rcx,[rbp+398h]
00007FF774FC9D20 call std::chrono::steady_clock::now (07FF774FB1226h)
00007FF774FC9D25 lea rdx,[rbp+3B8h]
00007FF774FC9D2C mov rcx,rax
00007FF774FC9D2F call std::chrono::time_point<std::chrono::steady_clock,std::chrono::duration<__int64,std::ratio<1,1000000000> > >::time_since_epoch (07FF774FB143Dh)
00007FF774FC9D34 mov rcx,rax
00007FF774FC9D37 call std::chrono::duration<__int64,std::ratio<1,1000000000> >::count (07FF774FB1361h)
I don't understand assembly, and I don't know why Release mode shows calls to the Windows API while Debug mode shows no mention of them. I am using Visual Studio.
Thanks.
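The VS optimizer doesn't seem to hoist the call to QueryPerformanceFrequency out of the loop. It doesn't recognize that the result is the same on every iteration after the first one, so it can't remove the repeated calls. Note that both calls in your Release listing go through the import table (the __imp_ prefix), so the compiler cannot see the body of _Query_perf_frequency and has to assume the call may have side effects. I'd call that a missing optimization rather than a bug: I would expect VS to hoist a call to a simple, side-effect-free function whose definition is visible (say, a hypothetical foo) out of a loop (I don't have access to VS at the moment, so I can't test).
The reason there is no visible call to the QueryPerformance* functions in Debug is that the optimizer isn't allowed to optimize there, so nothing is inlined. You see a plain call to std::chrono::steady_clock::now (which is what high_resolution_clock::now resolves to in your listing), and the performance-counter calls happen inside that function. In Release, now() is inlined, which exposes the _Query_perf_frequency and _Query_perf_counter calls directly at the call site. The optimizer is not swapping the standard library call for a "faster" Windows API call; it is the same code with one layer of function call removed.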
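If the repeated frequency query matters to you, hoisting it by hand is straightforward. Below is a minimal sketch of the idea, not the MSVC implementation: query_frequency() is a hypothetical stand-in for an opaque imported function such as _Query_perf_frequency, the 10 MHz value is an assumption, and the global counter exists only to make the number of calls observable.

```cpp
#include <cassert>
#include <cstdint>

// Counts how often the (hypothetical) frequency query runs.
static int g_frequency_queries = 0;

// Hypothetical stand-in for an imported function whose body the
// compiler cannot see. Assumed to always return 10 MHz.
std::int64_t query_frequency() {
    ++g_frequency_queries;
    return 10000000;
}

// Naive version: the frequency is queried on every iteration.
std::int64_t sum_ns_naive(const std::int64_t* ticks, int n) {
    std::int64_t total = 0;
    for (int i = 0; i < n; ++i) {
        const std::int64_t freq = query_frequency();
        assert(freq > 0);
        total += ticks[i] * 1000000000 / freq;  // ticks -> nanoseconds
    }
    return total;
}

// Hand-hoisted version: the frequency is queried once, before the loop.
std::int64_t sum_ns_hoisted(const std::int64_t* ticks, int n) {
    const std::int64_t freq = query_frequency();
    assert(freq > 0);
    std::int64_t total = 0;
    for (int i = 0; i < n; ++i) {
        total += ticks[i] * 1000000000 / freq;  // ticks -> nanoseconds
    }
    return total;
}
```

Both functions return the same sum; the only difference is that the naive one calls query_frequency() n times and the hoisted one calls it once. With the real Windows API you would do the same thing by calling QueryPerformanceFrequency once at startup and keeping the result, as the documentation you quoted suggests.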