What does "scratch buffer" mean in glibc?


I found that the code below is reported as a heap leak by the tcmalloc heap checker in draconian mode, but LSan does not report the leak.
(I assume that LSan suppresses allocations made internally by glibc.)

#include <string.h>
#include <netdb.h>

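int foo() {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);

    /* res is only valid (and only needs freeing) if getaddrinfo returns 0. */
    int rc = getaddrinfo("www.example.com", 0, &hints, &res);
    if (rc != 0)
        return rc;

    freeaddrinfo(res);
    return 0;
}

int main() {
    return foo() != 0;
}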

I looked into it a bit more and found that getaddrinfo() uses a scratch buffer internally in glibc.
I suspect that this scratch buffer is what gets reported as a memory leak
(even though it isn't harmful).

Sadly, I could not find a full explanation.
The header only says that a "scratch buffer is variable-sized buffers with on-stack default allocation".

What exactly does a scratch buffer do?

For reference, see glibc/include/scratch_buffer.h.


There are 2 answers

3
Florian Weimer On BEST ANSWER

Internally, all the NSS interfaces (of which getaddrinfo is one) look like gethostbyname_r:

   int gethostbyname_r(const char *name,
           struct hostent *ret, char *buf, size_t buflen,
           struct hostent **result, int *h_errnop);

The caller supplies a buffer for the result data via buf, of buflen bytes. If it turns out that this buffer is not large enough, the function fails with an ERANGE error. The caller is expected to grow the buffer (reallocating it in some way) and call the function again, with the other parameters unchanged. This repeats until the buffer is large enough and the function succeeds (or the function fails for some other reason). It's a longer story how we ended up with this strange interface, but it's the interface we have today. getaddrinfo looks different, but the internal backing implementations are very similar to the public gethostbyname_r function.
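Seen from the caller's side, the retry idiom described above might look roughly like the sketch below. The helper name lookup_with_retry, its parameters, and the doubling growth policy are assumptions made for illustration; only gethostbyname_r and the ERANGE convention come from the interface itself.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <netdb.h>
#include <stdlib.h>

/* Hypothetical helper (not part of glibc): resolve `name`, retrying with a
   larger buffer whenever gethostbyname_r reports ERANGE.
   Returns 0 on success; *out then points into the heap block stored in
   *bufp, which the caller must free. *attempts counts the calls made. */
int lookup_with_retry(const char *name, size_t buflen,
                      struct hostent *out, char **bufp, int *attempts)
{
    char *buf = NULL;

    *bufp = NULL;
    *attempts = 0;
    if (buflen == 0)
        buflen = 1;

    for (;;) {
        char *bigger = realloc(buf, buflen);
        if (bigger == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = bigger;

        struct hostent *result = NULL;
        int herr = 0;
        ++*attempts;
        int rc = gethostbyname_r(name, out, buf, buflen, &result, &herr);

        if (rc == ERANGE) {
            /* Buffer too small: grow it and repeat the same call. */
            buflen *= 2;
            continue;
        }
        if (rc != 0 || result == NULL) {
            /* Some other failure, or the name was not found. */
            free(buf);
            return rc != 0 ? rc : ENOENT;
        }
        *bufp = buf;
        return 0;
    }
}
```

Starting with a deliberately tiny buffer (for example 8 bytes) forces at least one ERANGE round trip before the lookup succeeds, which makes the loop easy to observe.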

Because the retry-with-a-larger-buffer idiom is so common throughout the NSS code, struct scratch_buffer was introduced. (Previously, there was a fairly eclectic mix of fixed buffer sizes, alloca, alloca with malloc fallback, and so on.) struct scratch_buffer starts out with a fixed-size on-stack buffer, which is used for the first NSS call. If that call fails with ERANGE, scratch_buffer_grow is called, which switches to a heap buffer; on subsequent calls, it allocates a larger heap buffer. scratch_buffer_free deallocates the heap buffer if there is one.
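To make the on-stack-first, heap-on-growth behavior concrete, here is a simplified stand-in. It is not glibc's actual implementation: the my_scratch names, the 1024-byte inline size, and the doubling policy are all assumptions chosen for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative only: a simplified model of the scratch buffer idea.
   All names and sizes are hypothetical. */
struct my_scratch {
    void *data;      /* current buffer: inline_space or a heap block */
    size_t length;   /* usable size of data, in bytes */
    union {
        max_align_t align;   /* keeps the inline storage suitably aligned */
        char bytes[1024];
    } inline_space;          /* lives wherever the struct lives, e.g. the stack */
};

/* Start out pointing at the inline storage; no heap allocation yet. */
static void my_scratch_init(struct my_scratch *s)
{
    s->data = s->inline_space.bytes;
    s->length = sizeof s->inline_space;
}

/* Release the heap block, if one was ever allocated, and reset. */
static void my_scratch_free(struct my_scratch *s)
{
    if (s->data != (void *)s->inline_space.bytes)
        free(s->data);
    my_scratch_init(s);
}

/* Switch to a heap buffer twice as large. The old contents are discarded,
   which is fine when the caller simply repeats the failed call.
   On failure the buffer falls back to the inline storage. */
static bool my_scratch_grow(struct my_scratch *s)
{
    size_t old_length = s->length;
    size_t new_length = old_length * 2;

    my_scratch_free(s);
    if (new_length <= old_length)   /* size_t overflow */
        return false;

    void *p = malloc(new_length);
    if (p == NULL)
        return false;

    s->data = p;
    s->length = new_length;
    return true;
}
```

A caller would declare a struct my_scratch as a local variable, call my_scratch_init, pass data and length to the lookup function, call my_scratch_grow and retry on ERANGE, and call my_scratch_free on every exit path. A missed free on an error path is exactly the kind of leak such a design can still produce.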

In your example, the leaks that tcmalloc reports are not related to the scratch buffers. (We have certainly had such bugs in getaddrinfo, particularly on obscure error paths, but the current code should be mostly okay.) Link order is also not the problem: tcmalloc is evidently active, otherwise you would not get any leak reports.

The reason why you see leaks with tcmalloc (but not with other tools such as valgrind) is that tcmalloc does not call the magic __libc_freeres function, which was specifically added for heap checkers. Normally, when the process terminates, glibc does not deallocate all internal allocations because the kernel will release that memory anyway. Most subsystems register their allocations in some way with __libc_freeres. In the getaddrinfo example, I see the following still-allocated resources:

  • Results of parsing /etc/resolv.conf (system DNS configuration).
  • Results of parsing /etc/nsswitch.conf (NSS configuration).
  • Various dynamic loader data structures resulting from internal dlopen calls (for loading the NSS service modules).
  • The cache for recording system IPv4/IPv6 support in getaddrinfo.

You can see these allocations easily if you run your example under valgrind, using a command like this one:

valgrind --leak-check=full --show-reachable=yes --run-libc-freeres=no

The key part is --run-libc-freeres=no, which instructs valgrind not to call __libc_freeres, which it does by default. If you omit this parameter, valgrind will not report any memory leaks.

0
flowit On

From the README of google-perftools:

In order to catch all heap leaks, tcmalloc must be linked last into your executable. The heap checker may mischaracterize some memory accesses in libraries listed after it on the link line. For instance, it may report these libraries as leaking memory when they're not. (See the source code for more details.)

And usually, libc is linked last.

Scratch buffer or scratch space is a term quite often used for pre-allocated memory (because startup time usually matters less than runtime performance) to be used for all kinds of things. I do not know exactly how glibc uses it, but I assume it needs a buffer for internal computations. Instead of allocating on the fly, it just uses the preallocated scratch buffer.

LSan has support for suppressing some leaks, but you'd have to check for yourself whether, and which, suppressions are active in your build.

As for draconian mode: I strongly suspect the scratch buffer is allocated before your main function and freed after it. In this case, HeapChecker would report it. Don't worry too much about it.