Thread Starvation vs Contention?

1.6k views Asked by At

According to Oracle docs about "thread starvation",

Starvation describes a situation where a thread is unable to gain regular access to shared resources and is unable to make progress. This happens when shared resources are made unavailable for long periods by "greedy" threads.

And on Oracle docs about thread "thread contention",

If a thread tries to take a lock that is already held by another thread, then it must wait until the lock is released. When this happens, there is so called “contention” for the lock.

Actually, the above definitions make some sense, I could not get a clear definition to contrast them. Can anyone please explain the differences between the above terms and the relationship between them, if any?

Note: By referring some answers of What is thread contention?, I felt that aren't those answers applicable to "thread starvation" too?

1

There are 1 answers

0
Solomon Slow On

"Starvation" refers to a specific form of contention in which one (or a few) certain thread(s) get locked out more often than other threads. It happens when the threads interact in some way such that every time the "starved" thread tries to acquire some lock, it's always just moments after some other thread already has locked it.


One common newbie mistake, of which you can find many examples on this site looks like this:

while (trivialTest()) {
    synchronized (lock) {
         processThatTakesSomeTime();
    }
}

The problem here is that almost the very next thing that a thread does after releasing the lock is, it locks it again. Any other thread that wants to lock the same lock will effectively have "gone to sleep" waiting for the first thread to finish the processThatTakesSomeTime() call.

When the first thread unlocks the lock, it's a race to see which thread can lock it again, but the first thread has a huge advantage because it's already running, and the others, which may have been "swapped out" by the operating system, always lose the race.


One way to fix the "newbie mistake" is to use so-called "optimistic locking:"

while (trivialTest()) {
    Snapshot snapshot;
    boolean success=false;
    while (! success) {
        synchronized (lock) {
            snapshot = takeSnapshotOfSharedData();
        }
        Result result = doSomeComputationWith(snapshot);
        synchronized (lock) {
            if (itStillMakesSenseToPublishResult(snapshot, result)) {
            publishToSharedData(result);
            success = true;
            }
        }
    }
}

It's "optimistic" because we hope the the inner loop will find success each time. But sometimes, that won't be the case, because some other thread will have changed the shared data in a way that is incompatible with our result. We have to throw the work away and try again in that case.

It may seem counterintuitive, but the performance gained by only keeping the lock locked just long enough to take a snapshot or to verify and publish the result can be much greater than the performance lost by occasionally having to throw work away and try again.