Java Virtual Threads causes application to hang on ReentrantLocks in logback

296 views Asked by At

I am experiencing a problem after migration from java 17 to java 21 with Spring Boot 3.2.1 and Tomcat 10. Virtual threads are enabled and used (e.g. I can see that tomcat spawns new virtual threads while handling requests, spring scheduled tasks are executed within virtual threads).

Problem occurs completely randomly. At some point application stop to respond to any HTTP requests including healthcheck on spring actuator. I did thread dump using jcmd and what I notice is that some threads are stuck on logback Reentrant locks. This includes virtual and non-virtual threads. Here are some examples:

#124 "org.springframework.kafka.KafkaListenerEndpointContainer#4-1-C-1"
      java.base/jdk.internal.misc.Unsafe.park(Native Method)
      java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
      java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
      java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
      java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
      java.base/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
      ch.qos.logback.core.OutputStreamAppender.writeBytes(OutputStreamAppender.java:200)
      ch.qos.logback.core.OutputStreamAppender.writeOut(OutputStreamAppender.java:193)
      ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:237)
      ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:102)
      ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:85)
      ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
      ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:272)
      ch.qos.logback.classic.Logger.callAppenders(Logger.java:259)
      ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:426)
      ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:386)
      ch.qos.logback.classic.Logger.log(Logger.java:780)
      org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.writeLog(LogContext.java:434)
      org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.info(LogContext.java:382)
#229434 "scheduling-4181" virtual
      java.base/java.lang.VirtualThread.park(VirtualThread.java:582)
      java.base/java.lang.System$2.parkVirtualThread(System.java:2639)
      java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
      java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:219)
      java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
      java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
      java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
      java.base/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
      ch.qos.logback.core.OutputStreamAppender.writeBytes(OutputStreamAppender.java:200)
      ch.qos.logback.core.OutputStreamAppender.writeOut(OutputStreamAppender.java:193)
      ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:237)
      ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:102)
      ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:85)
        ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
      ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:272)
      ch.qos.logback.classic.Logger.callAppenders(Logger.java:259)
      ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:426)
      ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:386)
      ch.qos.logback.classic.Logger.log(Logger.java:780)
      org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.writeLog(LogContext.java:434)
      org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.info(LogContext.java:382)
      org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(KafkaConsumer.java:912)
      org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(KafkaConsumer.java:944)

There are Kafka consumers that starts together with Spring context and some that starts only during Spring scheduled task (these binds to virtual threads).

Moreover I can see a lot of tomcat virtual threads with completely empty stack frames, e.g.: #230463 "tomcat-handler-109939" virtual

Threads that were supposed to carry virtual threads are not 100% utilized, I can see that some of them are really carrying virtual threads:

#241 "ForkJoinPool-3-worker-4"
      java.base/jdk.internal.vm.Continuation.run(Continuation.java:251)
      java.base/java.lang.VirtualThread.runContinuation(VirtualThread.java:221)
      java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
      java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
      java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
      java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
      java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
      java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

while some are idle:

#249 "ForkJoinPool-3-worker-5"
      java.base/jdk.internal.misc.Unsafe.park(Native Method)
      java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
      java.base/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1893)
      java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1809)
      java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

I can simply disable virtual threads support for Spring, but I would like to understand the problem. Is it some kind of deadlock? How to track it?

0

There are 0 answers