I'm having a dead-lock when trying to notify a condition_variable from a thread.
Here is my MCVE:
#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>

static boost::mutex m_mutex;
static boost::condition_variable m_cond;

void threadFunc()
{
    std::cout << "LOCKING MUTEX" << std::endl;
    boost::mutex::scoped_lock lock( m_mutex );
    std::cout << "LOCKED, NOTIFYING CONDITION" << std::endl;
    m_cond.notify_all();
    std::cout << "NOTIFIED" << std::endl;
}

int main( int argc, char* argv[] )
{
    while( true )
    {
        std::cout << "TESTING!!!" << std::endl;
        boost::mutex::scoped_lock lock( m_mutex );
        boost::thread thrd( &threadFunc );
        //m_cond.wait( lock );
        while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) )
        {
            std::cout << "WAITING..." << std::endl;
        }
        static int pos = 0;
        std::cout << "DONE!!! " << pos++ << std::endl;
        thrd.join();
    }
    return 0;
}
If using m_cond.wait( lock );, I see DONE!!! being written for every attempt, no problem here.
If I use the while ( !m_cond.timed_wait(lock,boost::posix_time::milliseconds(1)) ) loop, I see DONE!!! being written for a few attempts, and then, at some point, I get a deadlock and the waiting never ends:
TESTING!!!
LOCKING MUTEX
LOCKED, NOTIFYING CONDITION
NOTIFIED
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
WAITING...
...
I have read other posts on Stack Overflow (like "Condition variable deadlock"): they mention that this could happen if notify_all is called before the condition's wait function is running, so the mutex must be used to prevent that. But I feel like that's what I'm doing:
- I lock the mutex before creating the thread
- Then the thread cannot notify before m_cond.timed_wait is reached (and then the mutex is unlocked)
- Within the loop, in case of timeout, timed_wait relocks the mutex so notify cannot be done, we print "WAITING..." and we release the mutex when we are again ready to receive the notification
So why is the deadlock occurring? Could the condition be notified between the moment when timed_wait detects the timeout and the moment it relocks the mutex?
The problem is that if timed_wait times out before notify_all is called, it then has to wait for the thread to release the mutex (i.e. after the thread has called notify_all) before it can return. The loop then calls timed_wait again, but the thread has already finished, so timed_wait will never succeed. There are two scenarios where this can happen: one is if your thread takes more than a millisecond to start (this should be unlikely, but the scheduling vagaries of your OS mean it could happen, especially if the CPU is busy); the other is spurious wakeups.

Both scenarios can be guarded against by setting a flag when calling notify_all, which the waiting thread can check to ensure notify has been called:
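A minimal sketch of that flag-based approach follows. It is an illustration under stated assumptions, not a definitive fix: the flag name m_notified and the helper runOnce are hypothetical, and the sketch uses the C++11 standard-library equivalents (std::mutex, std::condition_variable, wait_for) instead of Boost so that it builds without extra libraries. The same pattern should carry over to boost::condition_variable::timed_wait from the question.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

static std::mutex m_mutex;
static std::condition_variable m_cond;
static bool m_notified = false;  // hypothetical flag, only touched while holding m_mutex

void threadFunc()
{
    std::lock_guard<std::mutex> lock( m_mutex );
    m_notified = true;   // record the notification so a late waiter still sees it
    m_cond.notify_all();
}

// One iteration of the question's loop (hypothetical helper).
// Returns how many times the timed wait expired before the flag was seen.
int runOnce()
{
    std::unique_lock<std::mutex> lock( m_mutex );
    m_notified = false;
    std::thread thrd( &threadFunc );

    int timeouts = 0;
    // Loop on the flag, not on the wait's return value: a notification that
    // arrives while we are blocked re-acquiring the mutex after a timeout is
    // still observed, and a spurious wakeup does not end the loop early.
    while ( !m_notified )
    {
        if ( m_cond.wait_for( lock, std::chrono::milliseconds( 1 ) )
             == std::cv_status::timeout )
        {
            ++timeouts;
        }
    }

    lock.unlock();
    thrd.join();
    return timeouts;
}
```

Calling runOnce() repeatedly in place of the body of the question's while( true ) loop should complete every iteration, because the exit condition no longer depends on the wait itself returning successfully.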