Searching for a safe magic number for in-memory data structure

2.5k views Asked by At


I'm implementing a heap allocator (malloc), and I need to choose a magic number to check if a given pointer point to a data structure I allocated. It seems obvious to me that no magic number can be considered completely safe (if the number is checked, I can be sure a point to one of my data structure), but maybe I missed something, so... if someone can help and bring me the number of my dreams, I'd really appreciate. Thx in advance.

4

There are 4 answers

0
derobert On BEST ANSWER

It depends on what you're doing this for. If you're doing it to try and catch programming mistakes (e.g., you want to make sure you don't accidentally mix up my_malloc/my_free and malloc/free), then just pick a random value. Sure, sometimes it'll fail to detect such a case, but that really doesn't matter. It shouldn't ever happen. So, here:

#define MAGIC_32BIT 0x77A5844CU
#define MAGIC_64BIT 0xD221A6BE96E04673UL

If correctness depends on this, then you really ought to do this another way. For example, by keeping track of which addresses you've allocated in a hash or tree or, in special cases, a bitmap.

If you're actually implementing malloc/free (e.g., writing your own C library), then keep in mind that freeing something that wasn't malloced (except NULL) is undefined behavior by the standard, so your code doesn't need to worry what happens.

0
Giuliano On

I haven't done such a thing (I worked with heaps but I did notimplemented any allocator) and I'm not certain of what you're trying to do, but maybe this is a case you should use hashing.

Depending on what exactly you're doing, it means to hash the address of the memory chunk, or the data it contains (and every time you change something, this implies recalculating the hash) or some kind of ID of the memory operation.

Again, I am not certain of what you are trying to achieve, then those are my 2 cents.

0
rmk On

TALLOC_MAGIC 0xe814ec70

This is from the file talloc.c in the source code here. Of course, you will have to look at why talloc chose this magic number, but it's a start.

0
EmeryBerger On

Rather than picking a single magic number, you should use a random number (preferably with at least one of the lower 8 bits set -- you can force this by ORing in 1, for instance) or some constant -- your choice, and then XOR it (^) with an address (e.g., the address you are checking). That approach will dramatically reduce the odds of an accidental collision.

For example, when you write the object header (or page header, depending on the kind of allocator you are writing), store MAGIC ^ addr. Now when you want to check if addr is valid, just see if value == addr ^ MAGIC (with appropriate casts, of course).

By the way, before embarking on creating your own custom memory allocator, please read this paper (Reconsidering Custom Memory Allocation, by Berger, Zorn and McKinley), from OOPSLA 2002.

http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf

Abstract: Programmers hoping to achieve performance improvements often use custom memory allocators. This in-depth study examines eight applications that use custom allocators. Surprisingly, for six of these applications, a state-of-the-art general-purpose allocator (the Lea allocator) performs as well as or better than the custom allocators. The two exceptions use regions, which deliver higher performance (improvements of up to 44%). Regions also reduce programmer burden and eliminate a source of memory leaks. However, we show that the inability of programmers to free individual objects within regions can lead to a substantial increase in memory consumption. Worse, this limitation precludes the use of regions for common programming idioms, reducing their usefulness. We present a generalization of general-purpose and region-based allocators that we call reaps. Reaps are a combination of regions and heaps, providing a full range of region semantics with the addition of individual object deletion. We show that our implementation of reaps provides high performance, outperforming other allocators with region-like semantics. We then use a case study to demonstrate the space advantages and software engineering beneļ¬ts of reaps in practice. Our results indicate that programmers needing fast regions should use reaps, and that most programmers considering custom allocators should instead use the Lea allocator.