So, I've been spending some time implementing a memory pool class in C++. Except for some minor problems along the way, it's gone fairly well. However, when I tried testing it today by allocating 1000 chunks by first using the memory pool and then comparing it to using new, I was actually getting close too three times worse performance (in nano seconds) when using the memory pool. My allocation method looks like this:
template <class T> T* MemPool<T>::allocate()
{
Chunk<T>* tempChunk = _startChunk;
while (tempChunk->_free == false)
{
if (tempChunk->_nextChunk == NULL)
throw std::runtime_error("No available chunks");
tempChunk = tempChunk->_nextChunk;
}
tempChunk->_free = false;
return &tempChunk->object;
}
I am starting at the first chunk in the pool and doing a search through the pool's linked list until I find a free chunk, or reach the end of the pool. Now, the bigger the pool, the longer this will take as the search has an O(n) time complexity where n is the number of chunks in the pool.
So I was curious as to if anyone have any thoughts on how to improve the allocation? My initial thought was to use two linked lists instead of just the one, where one contains free chunks and the other allocated chunks. When a new chunk is to be allocated, I would simply take the first element in the first mentioned linked list and move it to the allocated linked list. As far as I can see, this would eliminate the need to do any searching when allocating, and leave only deallocating requiring a search to find the correct chunk.
Any thoughts are appreciated as this is my first time working directly with memory in this way. Thanks!
Instead of using a hand-crafted linked list, it would probably be more effective to use a
std::list
(particularly if you use it with a custom allocator). Less error prone, and probably better optimised.Using two lists will allows simplifying a lot. No need to track, in the list itself, if a chunk is free or not - since that will be specified by which list the chunk is in (all that is needed is to ensure a chunk somehow doesn't appear in both lists).
Your current implementation means you are having to walk the linked list, both when allocating and deallocating.
If the chunks are fixed size, then allocation would simply be implemented by moving the first available chunk from the free to the allocated list - no need to search. To deallocate a chunk, you would still need to find it in the allocated list, which means you would need to map a
T*
to an entry in the list (e.g. perform a search), but then the act of deallocation will be simply moving the entry from one list to the other.If the chunks are variable size, you'll need to do a bit more work. Allocating would require finding a chunk that is at least the requested size when allocating. Overallocating (allocating a larger chunk than needed) would make allocation and deallocation more efficent in terms of performance, but also mean that fewer chunks can be allocated from the pool. Alternatively, break a large chunk (from the free list) in two, and place one entry on both lists (representing the allocated part, and the part left unallocated). If you do this, when deallocating, it may be desirable to merge chunks that are adjacent in memory (effectively, implement defragmentation of the free memory in the pool).
You will need to decide whether the pool can be used from multiple threads, and use appropriate synchronisation.