I was thinking about using very basic bounded (preallocated) stack to keep track of my threads ids in correct LIFO order. So i was wondering if my implementation is thread safe:
// we use maximum 8 workers
size_t idle_ids_stack[8];
// position current write happening at
std::atomic_uint_fast8_t idle_pos(0);
// this function is called by each thread when it is about to sleep
void register_idle(size_t thread_id)
{
std::atomic_thread_fence(std::memory_order_release);
idle_ids_stack[idle_pos.fetch_add(1, std::memory_order_relaxed)] = thread_id;
}
// this function can be called from anywhere at anytime
void wakeup_one()
{
uint_fast8_t old_pos(idle_pos.load(std::memory_order_relaxed));
std::atomic_thread_fence(std::memory_order_acquire);
size_t id;
do
{
if(old_pos == 0) return; // no idle threads in stack; exit;
id = idle_ids_stack[old_pos-1];
}
while (!idle_pos.compare_exchange_weak(old_pos, old_pos-1, std::memory_order_acquire, std::memory_order_relaxed));
// wakeup single thread
signal_thread(id);
}
I'm not an expert in lock free programming, but I'm pretty sure that your code is NOT thread safe.
Let's first look at
register_idle()
:What could happen here is that Thread1 increments
idle_pos
but before it stores its id, another thread callswakeup_once
and uses an outdated id (in the worst case even an invalid on, because the array is not yet initialized). Also I don't see the reason for the memory fence.In
wakeup_one()
you have a similar Problem (called ABA problem) :idle_pos
and accordingid
.wakeup_one
(idle_pos is decreased).register_idle
, which increases idle_pos again to the same value as before.idle_pos
is unchanged and signals the wrong threadI might be mistaken, but I believe it is in general not possible to create completely lock free stack based on an array, because you have to do two things in a single atomic operation: modify the index variable and store or load the value in the array.
Aside from those logic errors, I'd highly recommend not to use stand alone memory fences (they make the code less readable and might even be more costly). Also I would only start to manually specify memory orders, after I made sure that the program is correct with the default ones.