I have created a shared library which interposes malloc() and related calls. This works well, with some caveats, but there is one thing that does not work. I am expecting to be able to chain interposers such that I can run something like
LD_PRELOAD="/path/to/mymalloc.so /usr/lib64/jemalloc.so" some_app
The intention is that instead of forwarding to libc malloc() my library should now forward to jemalloc via RTLD_NEXT.
However, it segfaults, generating a stack trace that shows my malloc wrapper calling itself ad infinitum, though it does not allocate any memory itself when jemalloc is not in use:
#224364 0x00007facb1aef46a in Memory::HybridAllocator<Memory::LibCAllocator, Memory::StaticAllocator>::malloc (this=0x7facb1d0be60 <Memory::getHybridAllocator()::hybrid>, size=72704) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/HybridAllocator.h:109
#224365 0x00007facb1aefa8a in malloc (size=72704) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/mallocwrap.cpp:11
#224366 0x00007facb1aeeca2 in Memory::LibCAllocator::malloc (this=0x7facb1cf3720 <Memory::getBootstrapAllocator()::bootstrap>, requestSize=72704) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/LibCAllocator.h:77
#224367 0x00007facb1aef46a in Memory::HybridAllocator<Memory::LibCAllocator, Memory::StaticAllocator>::malloc (this=0x7facb1d0be60 <Memory::getHybridAllocator()::hybrid>, size=72704) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/HybridAllocator.h:109
#224368 0x00007facb1aefa8a in malloc (size=72704) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/mallocwrap.cpp:11
#224369 0x00007facb133fc1a in (anonymous namespace)::pool::pool (this=0x7facb163e200 <(anonymous namespace)::emergency_pool>) at ../../../../libstdc++-v3/libsupc++/eh_alloc.cc:123
#224370 __static_initialization_and_destruction_0 (__priority=65535, __initialize_p=1) at ../../../../libstdc++-v3/libsupc++/eh_alloc.cc:262
#224371 _GLOBAL__sub_I_eh_alloc.cc(void) () at ../../../../libstdc++-v3/libsupc++/eh_alloc.cc:338
#224372 0x00007facb1d1b8ba in call_init (l=<optimized out>, argc=argc@entry=4, argv=argv@entry=0x7ffe3ba440e8, env=env@entry=0x7ffe3ba44110) at dl-init.c:72
#224373 0x00007facb1d1b9ba in call_init (env=0x7ffe3ba44110, argv=0x7ffe3ba440e8, argc=4, l=<optimized out>) at dl-init.c:30
#224374 _dl_init (main_map=0x7facb1f3a1d0, argc=4, argv=0x7ffe3ba440e8, env=0x7ffe3ba44110) at dl-init.c:119
#224375 0x00007facb1d0cfda in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#224376 0x0000000000000004 in ?? ()
#224377 0x00007ffe3ba45d7f in ?? ()
#224378 0x00007ffe3ba45ddd in ?? ()
#224379 0x00007ffe3ba45de0 in ?? ()
#224380 0x00007ffe3ba45de4 in ?? ()
#224381 0x0000000000000000 in ?? ()
Debugging in gdb, the cause seems to be that __malloc_hook inside __libc_malloc() is somehow set to point at my implementation of malloc, resulting in infinite recursion. It must be jemalloc doing this somehow.
__GI___libc_malloc (bytes=16) at malloc.c:3037
3037 {
(gdb) s
3042 = atomic_forced_read (__malloc_hook);
(gdb) s
3043 if (__builtin_expect (hook != NULL, 0))
(gdb) s
3044 return (*hook)(bytes, RETURN_ADDRESS (0));
(gdb) s
malloc (size=140737488345424) at /home/brucea/work/git/libbede/src/main/cpp/memory/Memory/mallocwrap.cpp:12
The basic outline of my code is below (in C++ except for the low-level parts, so apologies for any offence caused to C purists):
extern "C" void* malloc(const size_t size) __THROW
{
return getMyAllocator().malloc(size);
}
// etc. for free() et al
// elsewhere
auto wrap(const char* sym)
{
static void* libchandle = nullptr;
auto f = dlsym(RTLD_NEXT,sym);
if (f == nullptr)
{
std::fprintf(stderr, "error: unable to find symbol via dlsym(RTLD_NEXT,%s):\n",sym);
std::fprintf(stderr, "%s\n",dlerror());
f = dlsym(RTLD_DEFAULT, sym);
}
if (f == nullptr)
{
std::fprintf(stderr, "error: unable to find symbol via dlsym(RTLD_DEFAULT,%s):\n",sym);
std::fprintf(stderr, "%s\n",dlerror());
if (libchandle == nullptr)
{
libchandle = dlopen("libc.so", RTLD_LAZY);
if (libchandle == nullptr)
{
std::fprintf(stderr, "unable to open libc.so:\n");
std::fprintf(stderr, "%s\n",dlerror());
}
if (libchandle != nullptr)
{
f = dlsym(libchandle, sym);
}
}
if (f == nullptr)
{
std::fprintf(stderr, "error: unable to find symbol via dlsym(\"libc\",%s):\n",sym);
std::fprintf(stderr, "%s\n",dlerror());
std::exit(1);
}
}
return f;
}
#define WRAP(X) \
{ \
static constexpr const char* const symName = #X; \
auto f = reinterpret_cast<decltype(&::X)>(wrap(#X)); \
this->X##Func = f; \
}
// Note: until ForwardingAllocator is setup
// malloc() etc are forwarded to __libc_malloc() etc
ForwardingAllocator::ForwardingAllocator()
{
WRAP(malloc)
WRAP(free)
WRAP(calloc)
WRAP(realloc)
WRAP(malloc_usable_size)
}
Lots of stuff omitted for brevity.
Are there any suggestions as to what I might be doing wrong or how I can better diagnose the issue?
It seems that jemalloc itself defines __libc_malloc
>nm /usr/lib/debug/usr/lib64/libjemalloc.so.2-5.2.1-2.el8.x86_64.debug | grep __libc_malloc
000000000000d4f0 t __libc_malloc
Some further information.
- malloc_hooks are deprecated so I don't use them.
Complications I have handled with some success:
dlsym() uses malloc() - I use a simple bootstrap allocator during startup before switching to the main one which forwards to libc's malloc()
I originally used a naive allocator as a bootstrap allocator
My wrapper to free() delegates to the appropriate free() depending on which malloc() was in use
I have now moved to using __libc_malloc as the bootstrap allocator, but allowing it to be replaced via dlsym as soon as possible.
This is a useful answer - https://stackoverflow.com/a/17850402/1569204
Though jemalloc provides __libc_malloc as a symbol, it is for use when statically linking with glibc only.
When you forward to __libc_malloc in your shared library you are still forwarding to the libc implementation. However, it seems that during startup jemalloc sets malloc hooks to point to the previous address of malloc(). In this case that is the malloc wrapper in the first library (i.e. yours). After setting a couple of things up internally, which currently requires 3 calls to malloc(), jemalloc installs itself as the new malloc via the libc malloc hooks.
Unfortunately there is no other symbol exported by glibc that you can use to bypass malloc hooks and use malloc directly. At least on the version I'm using.
You could handle this by setting malloc hooks yourself if you have another malloc replacement to use. However, you have already expressed a desire to "do the right thing" and not use malloc hooks because they are deprecated.
You can handle this without using malloc hooks by detecting recursive calls and providing a path to some other malloc for example:
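A minimal sketch of such a recursion guard follows. The names guarded_malloc, next_malloc and bootstrap_malloc are hypothetical and not taken from the library in the question: next_malloc stands in for the pointer obtained via dlsym(RTLD_NEXT, "malloc"), and bootstrap_malloc stands in for whatever "other malloc" you have available (here a simple, non-thread-safe bump allocator). It also assumes that touching a thread_local variable does not itself call malloc(), which may need checking for your build.

```cpp
#include <cstddef>

namespace sketch {

using MallocFn = void* (*)(std::size_t);

// Hypothetical: in the real library this would be set from
// dlsym(RTLD_NEXT, "malloc") once the forwarding allocator is ready.
MallocFn next_malloc = nullptr;

// Hypothetical fallback: a bump allocator over a static buffer.
// It never calls malloc(), so it is safe to use while recursing.
// Not thread safe; memory from here must never reach the real free().
void* bootstrap_malloc(std::size_t size)
{
    alignas(std::max_align_t) static unsigned char buffer[1 << 16];
    static std::size_t used = 0;
    constexpr std::size_t align = alignof(std::max_align_t);

    if (size > sizeof(buffer)) return nullptr;
    const std::size_t rounded = (size + align - 1) / align * align;
    if (rounded > sizeof(buffer) - used) return nullptr;

    void* p = buffer + used;
    used += rounded;
    return p;
}

// The body of the malloc() wrapper: one increment, one decrement and
// one conditional branch on top of the forwarded call.
void* guarded_malloc(std::size_t size)
{
    static thread_local int depth = 0;  // assumption: access does not allocate

    struct DepthGuard {
        int& d;
        explicit DepthGuard(int& x) : d(x) { ++d; }
        ~DepthGuard() { --d; }
    } guard(depth);

    // depth > 1 means the forwarded call came back into this wrapper
    // (for example via a malloc hook), so break the cycle here.
    if (depth > 1 || next_malloc == nullptr)
        return bootstrap_malloc(size);

    return next_malloc(size);
}

} // namespace sketch
```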
This is ugly but it works. Your wrapper to malloc is less efficient to the tune of one increment, one decrement and one conditional branch. It probably doesn't make any difference but you could use the C++ attribute [[unlikely]] or gcc's __builtin_expect to say that the branch for recursion is not likely to be taken.
There is another pitfall to be aware of. If you are forwarding multiple symbols you should check that they are all forwarded safely (typically this means to the same library). For example:
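One way to check this, sketched here under assumptions, is to ask the dynamic loader which object each resolved symbol lives in and compare the results. dladdr() and dlsym() are the real glibc calls; the helper names same_library and forwarded_consistently are hypothetical, and on older glibc versions you may need to link with -ldl.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for RTLD_NEXT and dladdr on glibc
#endif
#include <dlfcn.h>

namespace sketch {

// Returns true when both addresses belong to the same loaded object,
// judged by the load base address that dladdr() reports.
bool same_library(void* a, void* b)
{
    Dl_info ia{};
    Dl_info ib{};
    if (dladdr(a, &ia) == 0 || dladdr(b, &ib) == 0)
        return false;
    return ia.dli_fbase == ib.dli_fbase;
}

// Resolves two symbols the same way the wrapper does and checks that
// the next provider of each is the same shared object.
bool forwarded_consistently(const char* first, const char* second)
{
    void* a = dlsym(RTLD_NEXT, first);
    void* b = dlsym(RTLD_NEXT, second);
    return a != nullptr && b != nullptr && same_library(a, b);
}

} // namespace sketch
```

With only libc behind the wrapper, forwarded_consistently("malloc", "malloc_usable_size") would be expected to hold; with a chained library that provides only some of the symbols it would not.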
An example of this in practice is electric-fence. If I chain:
LD_PRELOAD="mymalloc.so electric-fence.so"
You find that malloc_usable_size() comes from libc while malloc comes from electric-fence. Granted electric-fence is not so common any more.
In this case it would be safer to replace malloc_usable_size() with a dummy function that always returns 0. For example the normal libc version of malloc_usable_size(ptr) (see https://code.woboq.org/userspace/glibc/malloc/malloc.c.html) looks at pointers located just before the allocated block (i.e. ptr-2*sizeof(size_t)). If you give it a ptr that does not conform to this pattern it could segfault.
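The dummy replacement could look something like the sketch below. The names dummy_usable_size and choose_usable_size are hypothetical; the boolean argument is assumed to come from whatever check you use to decide that malloc and malloc_usable_size resolve to the same library.

```cpp
#include <cstddef>

namespace sketch {

using UsableSizeFn = std::size_t (*)(void*);

// Always reports 0 usable bytes, so it never inspects memory around ptr.
std::size_t dummy_usable_size(void*)
{
    return 0;
}

// Keep the resolved function only when it is known to match the malloc
// in use; otherwise fall back to the dummy.
UsableSizeFn choose_usable_size(UsableSizeFn candidate, bool same_library_as_malloc)
{
    return (candidate != nullptr && same_library_as_malloc)
        ? candidate
        : &dummy_usable_size;
}

} // namespace sketch
```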
See for example "Is it possible to define a symbol dynamically such that it will be found by dlsym?"