malloc function interposition in the standard C and C++ libraries

1.8k views Asked by At

I want to write a shared library in such a way that it is possible to isolate it’s memory usage from the application it is linked against. That is, if the shared library, let’s call it libmemory.so, calls malloc, I want to maintain that memory in a separate heap from the heap that is used to service calls to malloc made in the application. This question isn't about writing memory allocators, it more about linking and loading the library and application together.

So far I’ve been experimenting with combinations of function interposition, symbol visibility, and linking tricks. So far, I can’t get this right because of one thing: the standard library. I cannot find a way to distinguish between calls to the standard library that internally use malloc that happen in libmemory.so versus the application. This causes an issue since then any standard library usage within libmemory.so pollutes the application heap.

My current strategy is to interpose definitions of malloc in the shared library as a hidden symbol. This works nicely and all of the library code works as expected except, of course, the standard library which is loaded dynamically at runtime. Naturally, I’ve been trying to find a way to statically embed the standard library usage so that it would use the interposed malloc in libmemory.so at compile time. I’ve tried -static-libgcc and -static-libstdc++ without success (and anyway, it seems this is discouraged). Is this the right answer?

What do?

P.s., further reading is always appreciated, and help on the question tagging front would be nice.

1

There are 1 answers

3
Employed Russian On

I’ve tried -static-libgcc and -static-libstdc++ without success

Of course this wouldn't succeed: malloc doesn't live in libgcc or libstdc++; it lives in libc.

What you want to do is statically link libmemory.so with some alternative malloc implementation, such as tcmalloc or jemalloc, and hide all malloc symbols. Then your library and your application will have absolutely separate heaps.

It goes without saying that you must never allocate something in your library and free it in the application, or vice versa.

In theory you could also link the malloc part of system libc.a into your library, but in practice GLIBC (and most other UNIX C libraries) does not support partially-static link (if you link libc.a, you must not link libc.so).

Update:

If libmemory.so makes use of a standard library function, e.g., gmtime_r, which is linked in dynamically, thereby resolving malloc at runtime, then libmemory.so mistakenly uses malloc provided at runtime (the one apparently from glibc

There is nothing mistaken about that. Since you've hidden your malloc inside your library, there is no other malloc that gmtime_r could use.

Also, gmtime_r doesn't allocate memory, except for internal use by GLIBC itself, and such memory could be cleaned up by __libc_freeres, so it would be wrong to allocate this memory anywhere other than using GLIBC's malloc.

Now, fopen is another example you used, and fopen does malloc memory. Apparently you would like fopen to call your malloc (even though it's not visible to fopen) when called by your library, but call system malloc when called by the application. But how can fopen know who called it? Surely you are not suggesting that fopen walk the stack to figure out whether it was called by your library or by something else?

So, if you really want to make your library never call into system malloc, then you would have to statically link all other libc functions that you use and that may call malloc (and hide them in your library as well).

You could use something like uclibc or dietlibc to achieve that.