Strange JVM thread hangs - suggestions to troubleshoot?

2.9k views Asked by At

While troubleshooting one of the jvm hung issues in our production environment, we encountered that one of threads which executes the following logger statement

logger.debug("Loaded ids as " + ids + ".");

is hung at this step with the thread status as runnable. Here ids is a Set. There is another thread which waits on the above thread through a count down latch to complete its task. The software takes thread dumps every 15 mins and the stacktraces of the two threads look as below

Stack trace for [THREAD GROUP: Job_Executor] [THREAD NAME:main-Runner Thread][THREAD STATE: WAITING]
    ...sun.misc.Unsafe.park(Native Method)
    ...java.util.concurrent.locks.LockSupport.park(Unknown Source)
    ...java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
    ...java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(Unknown Source)
    ...java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(Unknown Source)
    ...java.util.concurrent.CountDownLatch.await(Unknown Source)
    ...com.runner.MainRunner.stopThread(MainRunnerRunner.java:1334)


Stack trace for [THREAD GROUP: Job_Executor] [THREAD NAME:task executor][THREAD STATE: RUNNABLE]    
    ...java.util.AbstractCollection.toString(Unknown Source)           
    ...java.lang.String.valueOf(Unknown Source)      
    ...java.lang.StringBuilder.append(Unknown Source)    
    ...com.runner.CriticalTaskExecutor.loadByIds(CriticalTaskExecutor.java:143)

This jvm got hung for almost 24 hrs and finally we had to kill it to move ahead. The thread dumps indicates that there are 43 threads in RUNNABLE state including the above one.

What could be the reasons for the above thread to be in RUNNABLE state for 24 hrs just executing collection.toString()?

Any suggestions on how to proceed?

2

There are 2 answers

10
artbristol On

It depends on the toString() method that's being called. I've seen AbstractCollection.toString fall over when the String being constructed is too large for the heap. Otherwise, the problem could be in the toString of the objects in the collection.

To figure out which one it is, take some more stack dumps (10 or so). The stuck thread will probably usually be in the toString that's causing the problem.

As a quick fix, replace

logger.debug("Loaded ids as " + ids + ".");

with

logger.debug("Loaded ids as {}.", ids);

(Assuming you're using slf4j, otherwise look up the appropriate way to do parameterized logging in your framework).

This will skip the toString if debug is not enabled.

8
Gray On

What could be the reasons for the above thread to be in RUNNABLE state for 24 hrs just executing collection.toString()?

You've not provided enough information to be able to diagnose the problem. I would only challenge you to not assume there is a JVM issue going on here.

If we look at the source for the AbstractCollection.toString() method we see that it iterates through the collection and spits out approximately "[ item0, item1, item2 ]". Each of the item.toString() method is called to show the item.

If the application hang is in the collection toString() then my guess is that there is some problem with the iterator on the collection. You could tell this if your application is spinning -- using close to 100% of a CPU. Maybe the hasNext() method on the Set is always returning true?

If the application hang is in reality inside of the item.toString() then I'd make sure that your item is just displaying simple fields. Be careful of fields that if accessed make RPC calls like lazy loaded ORM wrapped fields.

If you provide details about the Set in question and show the id.toString() code we can help more.

It now sounds like this is a set of Integer objects. No idea why this would hang your application. Here's a couple other ideas:

  • Are you accessing this collection at all in a non-synchronized manner? Could multiple threads have made changes to the collection so that it is corrupted which has caused its iterator to spin? You might try wrapping it in a Collections.synchronizedSet(...).
  • Any chance the Set is huge and you are running close to being out of memory and the program is thrashing? That wouldn't hang your application however but just slow it to a crawl. And you'd start seeing out of memory exceptions.
  • Is there any chance that the toString() is being called over and over again? I assume you would see that in the logs however.