We have an amazingly elusive JVM crash occurring on an Ubuntu server that runs on AWS.
Our JVM crashes while crawling a few web pages.
The crash occurs at line 308 of the "safepoint" cpp module, at the point where a gauranteeArmed==0 statement occurs.
Our sysadmin has advised that, at the time of the crash, a very large number of threads had been created by the JVM.
We have not reproduced this bug on other Linux or OS X boxes.
We use the Ning library to crawl a few Web pages.
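One way to confirm the sysadmin's observation from inside the process is to log the JVM's own thread counters while the crawl runs. The following is a minimal sketch, not part of our actual code: the class name ThreadCountProbe and the sampling interval are assumptions for illustration. It relies only on the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCountProbe {

    /** Returns a one-line summary of the JVM's live, peak, and total-started thread counts. */
    public static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return "live=" + threads.getThreadCount()
                + " peak=" + threads.getPeakThreadCount()
                + " started=" + threads.getTotalStartedThreadCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical sampling loop: three samples, one second apart.
        // In a real crawler this would run in a background thread and write to the log.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000L);
        }
    }
}
```

If the "started" counter keeps climbing while "live" stays high, that would support the theory that threads are being created faster than they terminate.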
Related Posts
In each of these posts, a "safepoint"-related crash that comes from "nowhere" was observed. Most interestingly, the first post above actually exhibits a JVM crash during network-related events.
The cryptic nature of this bug leads me to believe that the cause is one of two things: a bug in thread creation and scheduling that is specific to our current version of Ubuntu and the way Java invokes some of its concurrency features, or some underlying library incompatibility that is highly idiosyncratic to our particular situation.
My Question(s)
My main question here is: what is the best method for debugging a JVM stack trace involving these "safepoints", and where can I get started learning about dealing with such errors? There have been other questions along this line, but I have not seen a generic answer.
Secondarily, any insight into AWS, Java, networking, and how Ubuntu might behave differently in the cloud would be useful here.
Try using the very latest JVM (6u32 or 7u4) and see if the crash is still reproducible. If you are on an older version, there's at least a decent chance it has already been fixed in the latest release.
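Before and after upgrading, it may help to record exactly which JVM each box is running, so the crashing server can be compared with the machines where the bug does not reproduce. This is a rough sketch using only standard system properties; the class name JvmVersionReport is made up for illustration.

```java
public class JvmVersionReport {

    /** Builds a single line describing the running JVM and the host OS. */
    public static String describe() {
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + " (java.version=" + System.getProperty("java.version") + ")"
                + " on " + System.getProperty("os.name")
                + " " + System.getProperty("os.version")
                + " " + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        // Print once at startup so the line ends up next to any later crash output.
        System.out.println(describe());
    }
}
```

Running this on the AWS server and on a box that does not crash should show quickly whether the two are on different JVM builds.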