Is deadlock possible in this simple scenario?

373 views Asked by At

Please see the following code:

std::mutex mutex;
std::condition_variable cv;
std::atomic<bool> terminate;

// Worker thread routine
void work() {
    while( !terminate ) {
        {
            std::unique_lock<std::mutex> lg{ mutex };
            cv.wait(lg);

            // Do something
        }
        // Do something
    }
}

// This function is called from the main thread
void terminate_worker() {
    terminate = true;
    cv.notify_all();
    worker_thread.join();
}

Is the following scenario can happen?

  1. Worker thread is waiting for signals.
  2. The main thread called terminate_worker();
    • The main thread set the atomic variable terminate to true, and then signaled to the worker thread.
    • Worker thread now wakes up, do its job and load from terminate. At this step, the change to terminate made by the main thread is not yet seen, so the worker thread decides to wait for another signal.
  3. Now deadlock occurs...

I wonder this is ever possible. As I understood, std::atomic only guarantees no race condition, but memory order is a different thing. Questions:

  1. Is this possible?
  2. If this is not possible, is this possible if terminate is not an atomic variable but is simply bool? Or atomicity has nothing to do with this?
  3. If this is possible, what should I do?

Thank you.

1

There are 1 answers

3
MikeMB On BEST ANSWER

I don't believe, what you describe is possible, as cv.notify_all() afaik (please correct me if I'm wrong) synchronizes with wait(), so when the worker thread awakes, it will see the change to terminate.

However:

A deadlock can happen the following way:

  1. Worker thread (WT) determines that the terminate flag is still false.

  2. The main thread (MT) sets the terminate flag and calls cv.notify_all().

  3. As no one is curently waiting for the condition variable that notification gets "lost/ignored".
  4. MT calls join and blocks.
  5. WT goes to sleep ( cv.wait()) and blocks too.

Solution:

While you don't have to hold a lock while you call cv.notify, you

  • have to hold a lock, while you are modifying terminate (even if it is an atomic)
  • have to make sure, that the check for the condition and the actual call to wait happen while you are holding the same lock.

This is why there is a form of wait that performs this check just before it sends the thread to sleep.

A corrected code (with minimal changes) could look like this:

// Worker thread routine
void work() {
    while( !terminate ) {
        {
            std::unique_lock<std::mutex> lg{ mutex };
            if (!terminate) {
                cv.wait(lg);
            }

            // Do something
        }
        // Do something
    }
}

// This function is called from the main thread
void terminate_worker() {
    {
        std::lock_guard<std::mutex> lg(mutex);
        terminate = true;
    }
    cv.notify_all();
    worker_thread.join();
}