I'm implementing a heap allocator (malloc), and I need to choose a magic number to check if a given pointer point to a data structure I allocated. It seems obvious to me that no magic number can be considered completely safe (if the number is checked, I can be sure a point to one of my data structure), but maybe I missed something, so... if someone can help and bring me the number of my dreams, I'd really appreciate.
Thx in advance.
Searching for a safe magic number for in-memory data structure
2.6k views Asked by fokenrute AtThere are 4 answers
I haven't done such a thing (I worked with heaps but I did notimplemented any allocator) and I'm not certain of what you're trying to do, but maybe this is a case you should use hashing.
Depending on what exactly you're doing, it means to hash the address of the memory chunk, or the data it contains (and every time you change something, this implies recalculating the hash) or some kind of ID of the memory operation.
Again, I am not certain of what you are trying to achieve, then those are my 2 cents.
TALLOC_MAGIC 0xe814ec70
This is from the file talloc.c
in the source code here.
Of course, you will have to look at why talloc chose this magic number, but it's a start.
Rather than picking a single magic number, you should use a random number (preferably with at least one of the lower 8 bits set -- you can force this by ORing in 1, for instance) or some constant -- your choice, and then XOR it (^) with an address (e.g., the address you are checking). That approach will dramatically reduce the odds of an accidental collision.
For example, when you write the object header (or page header, depending on the kind of allocator you are writing), store MAGIC ^ addr
. Now when you want to check if addr
is valid, just see if value == addr ^ MAGIC
(with appropriate casts, of course).
By the way, before embarking on creating your own custom memory allocator, please read this paper (Reconsidering Custom Memory Allocation, by Berger, Zorn and McKinley), from OOPSLA 2002.
http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf
Abstract: Programmers hoping to achieve performance improvements often use custom memory allocators. This in-depth study examines eight applications that use custom allocators. Surprisingly, for six of these applications, a state-of-the-art general-purpose allocator (the Lea allocator) performs as well as or better than the custom allocators. The two exceptions use regions, which deliver higher performance (improvements of up to 44%). Regions also reduce programmer burden and eliminate a source of memory leaks. However, we show that the inability of programmers to free individual objects within regions can lead to a substantial increase in memory consumption. Worse, this limitation precludes the use of regions for common programming idioms, reducing their usefulness. We present a generalization of general-purpose and region-based allocators that we call reaps. Reaps are a combination of regions and heaps, providing a full range of region semantics with the addition of individual object deletion. We show that our implementation of reaps provides high performance, outperforming other allocators with region-like semantics. We then use a case study to demonstrate the space advantages and software engineering beneļ¬ts of reaps in practice. Our results indicate that programmers needing fast regions should use reaps, and that most programmers considering custom allocators should instead use the Lea allocator.
It depends on what you're doing this for. If you're doing it to try and catch programming mistakes (e.g., you want to make sure you don't accidentally mix up
my_malloc
/my_free
andmalloc
/free
), then just pick a random value. Sure, sometimes it'll fail to detect such a case, but that really doesn't matter. It shouldn't ever happen. So, here:If correctness depends on this, then you really ought to do this another way. For example, by keeping track of which addresses you've allocated in a hash or tree or, in special cases, a bitmap.
If you're actually implementing malloc/free (e.g., writing your own C library), then keep in mind that
free
ing something that wasn'tmalloc
ed (except NULL) is undefined behavior by the standard, so your code doesn't need to worry what happens.