jemalloc performance degrades heavily when allocating large blocks of memory


I need to allocate megabyte-sized blocks of memory many times, and for performance reasons I chose jemalloc to replace glibc's malloc. But my test results show that jemalloc is much slower for some large block sizes (about 10x slower at 64kB and about 9x slower at 16MB in the results below).

My test code is as follows (connallmalloc is just an alias for jemalloc; mimalloc is Microsoft's allocator):

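#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>

#include <mimalloc.h> // mi_malloc, mi_free
// connallmalloc / connallfree are aliases for jemalloc's malloc/free; their header is not shown here.

void libc_allocate(std::size_t size)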
{
    int *ptr = (int *)malloc(size);
    *ptr = 1;
    free(ptr);
}

void mim_allocate(std::size_t size)
{
    int *ptr = (int *)mi_malloc(size);
    *ptr = 1;
    mi_free(ptr);
}

void con_allocate(std::size_t size)
{
    int *ptr = (int *)connallmalloc(size);
    *ptr = 1;
    connallfree(ptr);
}

void repeat_test(int times, void (*func)(std::size_t), std::string tips, std::size_t size)
{
    auto start_time = std::chrono::high_resolution_clock::now();

    for (int i = 0; i < times; ++i)
    {
        func(size);
    }

    auto end_time = std::chrono::high_resolution_clock::now();

    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time).count();

    int b = size % 1024;
    int kb = (size / 1024) % 1024;
    int mb = size / 1024 / 1024;

    std::cout << tips << ", allocate size: " << (mb > 0 ? std::to_string(mb) + "MB " : "")
              << (kb > 0 ? std::to_string(kb) + "kB " : "") << (b > 0 ? std::to_string(b) + "B " : "") << " allocate time: " << duration << std::endl;
}


int main(int, char **)
{
    int repeat_times = 10000000;

    const int test_size = 11;

    int sizes[test_size] = {16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304, 16777216};

    // std::cout << sizeof(Data) << std::endl;

    for (int i = 0; i < test_size; ++i)
    {
        repeat_test(repeat_times, libc_allocate, "libc allocate", sizes[i]);

        repeat_test(repeat_times, con_allocate, "conn allocate", sizes[i]);

        repeat_test(repeat_times, mim_allocate, "mima allocate", sizes[i]);

        std::cout<< "\n";
    }
}

Here is the test result (total time in milliseconds for 10,000,000 allocate/free calls at each size; libc = glibc malloc, conn = jemalloc via connallmalloc, mima = mimalloc):

size     libc    conn    mima
16B       114     101     114
64B       115     100     115
256B      146     102     143
1kB        96     109      95
4kB       212     131     211
16kB      214     267     211
64kB      235    2428     232
256kB    2210    2425    2203
1MB      2232    2399    2228
4MB      2536    2386    2542
16MB     4077   35819    4076

Environment: Ubuntu 22.04, Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz, 16GB RAM

Are there any parameters I can tune to improve jemalloc's performance when allocating megabyte-sized blocks? I read the documentation but could not find any. Or is there another malloc library that is better suited to allocating large blocks?
