Memory release in pthread TLS destructor is not detected by valgrind/massif


Symptoms: I allocate a TLS key with a destructor, create a batch of threads, and pass the TLS key to each thread. Each thread allocates memory and stores the pointer in TLS; the TLS destructor frees that memory. I wait for all threads to finish before the app exits. The app runs under valgrind/massif, which reports this memory as not deallocated.

int main(int argc, char **argv)
{
  pthread_key_t* key = new pthread_key_t();
  pthread_key_create(key, my_destructor);

  pthread_t threads[32000];

  for(int i=0; i<32000; ++i)
    pthread_create(&threads[i], NULL, my_thread, key);

  for(int i=0; i<32000; ++i)
    pthread_join(threads[i], NULL);

  return 0;
}

In the thread runner I allocate the memory and set it up in the TLS:

extern "C" void* my_thread(void* p)
{
  pthread_setspecific(*(pthread_key_t*)p, malloc(100));

  return NULL;
}

In the TLS destructor, I release the memory:

extern "C" void my_destructor(void *p)
{
  free(p);
}

I run this under valgrind/massif 3.19 with the following options:

  --tool=massif
  --heap=yes
  --pages-as-heap=yes
  --log-file=/tmp/my.log
  --massif-out-file=/tmp/my.massif.log

Then I run ms_print /tmp/my.massif.log and get leaks reported like the following:

| ->01.75% (67,108,864B) 0x76F92D0: new_heap (in /usr/lib64/libc-2.17.so)
| | ->01.75% (67,108,864B) 0x76F98D3: arena_get2.isra.3 (in /usr/lib64/libc-2.17.so)
| |   ->01.75% (67,108,864B) 0x76FF77D: malloc (in /usr/lib64/libc-2.17.so)
| |     ->01.75% (67,108,864B) 0x410300: my_thread (threadsT.cpp:136)
| |       ...
| |       <skipped by author>
| |       ...
| |             
| ->00.00% (73,728B) in 1+ places, all below ms_print's threshold (01.00%)

...while I would not expect anything reported leaked at all.

I added the instrumentation to my_destructor and manually verified that:

  • it is invoked, indeed
  • it deallocates the memory, as it is supposed to do

Is there something apparent I am doing wrong here that makes valgrind/massif report these? Is it a valgrind/massif limitation that it cannot detect the memory deallocation when invoked from TLS destructors?

Building and running that with gcc 4.9.4 on Red Hat Enterprise Linux Server release 7.9 (Maipo).


There are 2 answers

Paul Floyd On BEST ANSWER

A second answer, this time concentrating on the 'leak' aspect.

Massif isn't really a leak detector. It's for profiling heap use.

If I compile the example (with 320 threads), then at the end I get about 89 million bytes still allocated. That is made up of:

  • 75%: the arena used by malloc called from start_thread
  • 9%: pthread_create
  • 15%: loading shared libraries

None of that looks like much of a concern to me. I assume that the start_thread memory is the pthread stack cache.

If I use massif for profiling malloc/new, then the last sample is

 n      time(i)   total(B)   useful-heap(B)   extra-heap(B)   stacks(B)
73    2,929,610      2,360            2,308              52           0
Paul Floyd On

You should check the return status for your thread creation. It's unlikely that you are succeeding in creating 32000 threads.
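As a rough sketch of what checking the return status could look like: the helper names create_checked and join_all below are made up for illustration and are not part of the question's code. pthread_create returns 0 on success and an error number otherwise, so the loop stops at the first failure and only the threads that were really created get joined.

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical helper: tries to start `count` threads and returns the
// handles of the ones that were actually created.
std::vector<pthread_t> create_checked(int count, void* (*fn)(void*), void* arg)
{
  std::vector<pthread_t> created;
  created.reserve(count);
  for (int i = 0; i < count; ++i) {
    pthread_t tid;
    // pthread_create returns an error number directly; it does not set errno.
    int rc = pthread_create(&tid, NULL, fn, arg);
    if (rc != 0) {
      std::fprintf(stderr, "pthread_create #%d failed: %s\n", i, std::strerror(rc));
      break;  // stop at the first failure
    }
    created.push_back(tid);
  }
  return created;
}

// Hypothetical helper: joins only the threads that exist and returns
// how many joins succeeded.
int join_all(const std::vector<pthread_t>& threads)
{
  int joined = 0;
  for (pthread_t tid : threads)
    if (pthread_join(tid, NULL) == 0)
      ++joined;
  return joined;
}
```

In the question's main this would be something like create_checked(32000, my_thread, key) followed by join_all on the result; if the returned vector is shorter than 32000, the stderr output says which call failed and why.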

A bit of Valgrind source:

coregrind/pub_core_options.h:#define MAX_THREADS_DEFAULT 500
coregrind/m_scheduler/scheduler.c:   VG_(printf)("Use --max-threads=INT to specify a larger number of threads\n"

Assuming that this is amd64 Linux, I believe that the default pthread stack size is 8 MiB. That means you need roughly 250 GiB (32,000 × 8 MiB) for stack memory. Does your machine have that much?
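To check the stack-size assumption on a given machine rather than guess, something like the following could be used. It is only a sketch: default_stack_bytes and total_stack_bytes are hypothetical names, and it relies on glibc reporting the process default stack size for a freshly initialised pthread_attr_t.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper: returns the default stack size for new threads
// in bytes, or 0 if it cannot be queried.
size_t default_stack_bytes()
{
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0)
    return 0;
  size_t size = 0;
  if (pthread_attr_getstacksize(&attr, &size) != 0)
    size = 0;
  pthread_attr_destroy(&attr);
  return size;
}

// Hypothetical helper: total stack reservation for `threads` threads
// that each get `per_thread` bytes of stack.
unsigned long long total_stack_bytes(unsigned long long threads, size_t per_thread)
{
  return threads * per_thread;
}
```

With 32000 threads and an 8 MiB default, total_stack_bytes(32000, 8u * 1024 * 1024) comes to 268,435,456,000 bytes, i.e. 250 GiB.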

Please try the following

  1. Use pthread_attr_setstacksize to set the stack sizes to PTHREAD_STACK_MIN (16k).
  2. Run Valgrind with --max-threads=32001
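A minimal sketch of step 1, assuming a thread body as small as the one in the question (the helper name create_small_stack_thread is made up for illustration). PTHREAD_STACK_MIN is the smallest stack the system accepts, which may be too small for threads that do real work.

```cpp
#include <pthread.h>
#include <limits.h>   // PTHREAD_STACK_MIN
#include <cstddef>

// Hypothetical helper: starts one thread with the minimum allowed stack.
// Returns 0 on success or the error number of the call that failed.
int create_small_stack_thread(pthread_t* tid, void* (*fn)(void*), void* arg)
{
  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0)
    return rc;

  // Ask for the smallest stack instead of the (typically 8 MiB) default.
  rc = pthread_attr_setstacksize(&attr, (size_t)PTHREAD_STACK_MIN);
  if (rc == 0)
    rc = pthread_create(tid, &attr, fn, arg);

  // The attribute object is no longer needed once the thread is created.
  pthread_attr_destroy(&attr);
  return rc;
}
```

In the question's loop this would stand in for the plain pthread_create call, e.g. create_small_stack_thread(&threads[i], my_thread, key), with the program then run under Valgrind with --max-threads=32001 as in step 2.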

Even with the above you may still hit some Valgrind limits such as VG_N_SEGMENTS.

If you see a message like

"Valgrind: FATAL: VG_N_SEGMENTS is too low. Increase it and rebuild.
Exiting now."

then you will need to rebuild Valgrind with an increased limit.