How can I measure time duration of memory allocation in C++?


Whatever size of integer array I allocate, the measured time is always 0 nanoseconds.

Here is the code:

#include <iostream>
#include <chrono>


using namespace std;
using namespace chrono;

int* memory_allocation()
{
  int *allocated_ints = new int[1000000];
  return allocated_ints;
}


int main()
{
   high_resolution_clock::time_point t0 = high_resolution_clock::now();

   int *p = memory_allocation();

   high_resolution_clock::time_point t1 = high_resolution_clock::now();
   
   nanoseconds ns = duration_cast<nanoseconds>(t1 - t0);
   std::cout << ns.count() << "ns\n";

   delete[] p;

   return 0;
}

I also tried wrapping the allocation in a for loop of 1,000,000 iterations, but then I have to deallocate inside the loop as well to avoid exhausting the heap, so the measurement also includes the time for the 'delete' operation.

Can the allocation really be this fast? If so, is there another way to measure it?

There is 1 answer

Answered by Alan Milton

Depending on your CPU speed, the actual timer resolution, and other platform specifics, the allocation can be that fast: faster than the resolution of your high-resolution timer.

As has been mentioned here, new[] does not allocate physical storage. The memory is not actually committed to your process the moment you call new[]; the call only reserves a range within your process's virtual address space. That is why it is fast.

Depending on your platform, you may be able to use the CPU's hardware counter to measure time more precisely. The x86 architecture has the RDTSC instruction, which reads the CPU's tick counter. One caveat is that modern CPUs change their clock speed depending on load, so tick counts do not translate directly into wall-clock time. Even so, it is the most precise tool you could use. In C++ it can be accessed through an inline assembly block or through an intrinsic function such as __rdtsc(). The code below is specific to Visual Studio.

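// Requires C++20: streaming a std::chrono::duration with operator<< was added in C++20
#include <iostream>
#include <chrono>
#include <cstddef>  // std::byte, size_t
#include <limits>
#include <algorithm>
#include <intrin.h> // Platform specific, gives __rdtsc()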

using namespace std;
using namespace chrono;

auto memory_allocation(size_t n) -> auto
{
    auto memory = new byte[n];
    return memory;
}

int main()
{
    // Estimate the best time clock resolution
    auto t0 = high_resolution_clock::now();
    decltype(t0) t1;
    do
    {
        t1 = high_resolution_clock::now();
    } while (t1 == t0);

    auto resolution = duration_cast<nanoseconds>(t1 - t0);
    cout << "Time clock resolution: " << resolution << endl;

    for (size_t n = 0; n <= 1'000'000'000; n = n ? n * 10 : 1)
    {
        auto best_dx = numeric_limits<decltype(__rdtsc())>::max(); // the type is uint64
        auto best_ns = nanoseconds::max(); // same type as the duration_cast result passed to min() below

        for (int i = 0; i != 1000; ++i)
        {
            auto t0 = high_resolution_clock::now();
            auto x = __rdtsc();

            auto p = memory_allocation(n);

            auto dx = __rdtsc() - x;
            auto t1 = high_resolution_clock::now();

            delete[] p;

            auto ns = duration_cast<nanoseconds>(t1 - t0);
            best_dx = min(best_dx, dx);
            best_ns = min(best_ns, ns);
        }

        cout << n << ", " << best_dx << ", " << best_ns << endl;
    }

    return 0;
}

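Below is the output from my machine. After the first line, the columns are the allocation size in bytes, the best RDTSC tick count, and the best timer reading: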

Time clock resolution: 100ns
0, 72, 0ns
1, 72, 0ns
10, 90, 0ns
100, 290, 100ns
1000, 306, 100ns
10000, 306, 100ns
100000, 264, 100ns
1000000, 234, 100ns
10000000, 9242, 3200ns
100000000, 10462, 3600ns
1000000000, 22240, 7600ns