I'm experimenting with threads parking and decided to build some sort of service. Here is how it looks like:
public class TestService {
private static final Logger logger = LoggerFactory.getLogger(TestService.class); // logback I think this logger causes some troubles
private final CountDownLatch stopLatch;
private final Object parkBlocker = new Object();
private volatile boolean stopped;
private final Thread[] workers;
public TestService(int parallelizm) {
stopLatch = new CountDownLatch(parallelizm);
workers = new Thread[parallelizm];
for (int i = 0; i < parallelizm; i++) {
workers[i] = new Thread(() -> {
try {
while (!stopped) {
logger.debug("Parking " + Thread.currentThread().getName());
LockSupport.park(parkBlocker);
logger.debug(Thread.currentThread().getName() + " unparked");
}
} finally {
stopLatch.countDown();
}
});
}
}
public void start() {
Arrays.stream(workers).forEach(t -> {
t.start();
logger.debug(t.getName() + " started");
});
}
public boolean stop(long timeout, TimeUnit unit) throws InterruptedException {
boolean stoppedSuccefully = false;
this.stopped = true;
unparkWorkers();
if (stopLatch.await(timeout, unit)) {
stoppedSuccefully = true;
}
return stoppedSuccefully;
}
private void unparkWorkers() {
Arrays.stream(workers).forEach(w -> {
LockSupport.unpark(w);
logger.debug("Un-park call is done on " + w.getName());
});
}
}
The issue I faced with was that if I then test this service as follows:
public static void main(String[] args) = {
while(true) {
TestService service = new TestService(2);
service.start();
if (!service.stop(10000, TimeUnit.MILLISECONDS))
throw new RuntimeException();
}
}
I sometimes got the following behavior:
14:58:55.226 [main] DEBUG com.pack.age.TestService - Thread-648 started
14:58:55.227 [Thread-648] DEBUG com.pack.age.TestService - Parking Thread-648
14:58:55.227 [main] DEBUG com.pack.age.TestService - Thread-649 started
14:58:55.227 [main] DEBUG com.pack.age.TestService - Un-park call is done on Thread-648
14:58:55.227 [Thread-648] DEBUG com.pack.age.TestService - Thread-648 unparked
14:58:55.227 [main] DEBUG com.pack.age.TestService - Un-park call is done on Thread-649
14:58:55.227 [Thread-649] DEBUG com.pack.age.TestService - Parking Thread-649
Exception in thread "main" java.lang.RuntimeException
at com.pack.age.Test$.main(Test.scala:12)
at com.pack.age.Test.main(Test.scala)
The thread is hanging out on parking:
"Thread-649" #659 prio=5 os_prio=0 tid=0x00007efe4433f000 nid=0x7691 waiting on condition [0x00007efe211c8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000720739a68> (a java.lang.Object)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at com.pack.age.TestService.lambda$new$0(TestService.java:27)
at com.pack.age.TestService$$Lambda$1/1327763628.run(Unknown Source)
at java.lang.Thread.run(Thread.java:748)
I don't see any race in park-unpark in the service. Moreover if the unpark is called before park, the park is guaranteed no to block (that's what javadocs say).
Maybe I misuse LockSupport::park. Can you suggest any fix?
This has nothing to do with logger, though it's usage brings the problem to the surface. You have a race condition, as simple as that. Before explaining that race condition you need to understand a few things from
LockSupport::unparkdocumentation first:The first point is explain here. The short version is : if you have a
threadthat has already been started, but has not yet calledpark, and within this period of time (between thestartof the thread andpark), some other thread callsunparkon the first one : that thread will not park, at all. The permit will be available immediately. May be this little drawing will make it more clear:Notice how
ThreadBcallsunpark(ThreadA)between the period whereThreadAhas calledstartandpark. As such, whenThreadAreachespark: it is guaranteed not to block, exactly like the documentation says.The second point from the same documentation is:
Let's see that via a drawing:
After
ThreadAcallspark, it will hang forever, sinceThreadBnever callsunparkon it again. Notice that the call tounparkwas made beforeThreadAhas started (unlike the previous example).And this is exactly what happens in your case:
LockSupport.unpark(w);(fromunparkWorkers) is called beforet.start();frompublic void start(){...}. In simpler words - your code callsunparkon bothworkersbefore they even start, as such when they ultimately reachpark- they are stuck, no one is able tounparkthem. The fact that you see this with aloggerand not withSystem::outhas most probably to do with the face that when youprintln- there is asynchronizedmethod under the hood.As a matter of fact,
LockSupportoffers exactly the semantics needed to prove this. For this we need (for simplicity :SOProblem service = new SOProblem(1);)And now we need to insert this in proper methods. First flag that fact that we have called
unpark:Then reset the flag after a cycle has ended:
Then change the constructor to flag that the thread has started:
Then, when your thread freezes you need to inspect the flag:
where
debugmethod is:When the issue re-produces, you have called
unparkbefore you calledpark, that happens whenx = 3as the output.