My Java server has started to crash repeatedly, and I can't find out why.
I have a server with 7.5 GB of memory and have allocated 3 GB to the Java process.
The server was running fine and ran garbage collection many times, but the JVM crashed when it came under memory pressure.
Here is the information from JConsole after the crash:
Current heap size: 2 958 868 kbytes
Maximum heap size: 3 066 816 kbytes
Committed memory: 3 066 816 kbytes
Pending finalization: 0 objects
Garbage collector: Name = 'PS MarkSweep', Collections = 66, Total time spent = 7 minutes
Garbage collector: Name = 'PS Scavenge', Collections = 43 055, Total time spent = 44 minutes
Operating System: Linux 2.6.31-302-ec2
Architecture: amd64
Number of processors: 2
Committed virtual memory: 8 405 760 kbytes
Total physical memory: 7 882 780 kbytes
Free physical memory: 34 540 kbytes
Total swap space: 0 kbytes
Free swap space: 0 kbytes
The heap is back down to about 0.5 GB after each GC run; it rises from 0.5 GB to 3 GB and then falls back to 0.5 GB the whole time, so this is definitely not a problem with lingering objects. In any case, it should throw an OutOfMemoryError instead of crashing. I am using these parameters:
-Xmn256m -Xms768m -Xmx3000m -XX:NewRatio=2 -server -verbosegc -XX:PermSize=256m -XX:MaxPermSize=256m -XX:SurvivorRatio=8 -XX:+UseParallelGC -XX:ParallelGCThreads=2 -XX:+UseParallelOldGC
What is wrong, and what should I do? The crash log output was:
Current thread (0x00007fe899755800): JavaThread "508616253@qtp-1871151428-3352" [_thread_in_vm, id=11941, stack(0x00007fe86a4e5000,0x00007fe86a5e6000)]
siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000
Registers:
RAX=0x00007fe9c60333b8, RBX=0x00007fe899755800, RCX=0x0d00007fe8f58787, RDX=0x00007fe9c6031888
RSP=0x00007fe86a5e3fd0, RBP=0x00007fe86a5e4020, RSI=0x00007fe899755800, RDI=0x00007fe95bae1770
R8 =0x00007fe9be341620, R9 =0x0000000000000001, R10=0x00007fe9c5b84460, R11=0x00007fe9c051a52b
R12=0x00007fe9c051a529, R13=0x00007fe9c6034ac0, R14=0x00007fe9c051a599, R15=0x0900007fe8f58787
RIP=0x00007fe9c5bd562d, EFL=0x0000000000010246, CSGSFS=0x000000000000e033, ERR=0x0000000000000000
TRAPNO=0x000000000000000d
Stack: [0x00007fe86a4e5000,0x00007fe86a5e6000], sp=0x00007fe86a5e3fd0, free space=3fb0000000000000030k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x64d62d]
V [libjvm.so+0x5fc4df]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v ~RuntimeStub::_complete_monitor_locking_Java
J sun.nio.ch.SocketChannelImpl.write(Ljava/nio/ByteBuffer;)I
J org.mortbay.io.nio.ChannelEndPoint.flush(Lorg/mortbay/io/Buffer;)I
J org.mortbay.jetty.HttpGenerator.flush()J
...
Sounds like a memory leak. The GC can only clean up objects that are no longer referenced, and if your application (or the server itself) doesn't release unused resources, then after a while even 3 GB is not enough.
A profiler might help to identify data structures that grow unexpectedly.
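For illustration, a classic leak pattern looks like the sketch below (the class and field names are hypothetical): a long-lived collection that is only ever added to, so its entries stay strongly reachable and the GC can never reclaim them.
import java.util.HashMap;
import java.util.Map;

public class SessionRegistry {
    // A static map lives as long as its class loader, so every entry
    // it holds stays strongly reachable until it is explicitly removed.
    private static final Map<String, byte[]> CACHE =
            new HashMap<String, byte[]>();

    // Called on every request...
    public static void register(String sessionId, byte[] payload) {
        CACHE.put(sessionId, payload);
    }

    // ...but if nothing ever removes stale entries, the map grows
    // until the heap fills up, no matter how often the GC runs.
}
In a profiler, this kind of leak stands out as a single map or list whose retained size keeps growing between snapshots.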
Idea: start the server with the
-verbose:gc
option and check what happens just before it dies. Decrease the heap size for the test so that you don't have to wait too long. If it's a memory leak, I expect you will see regular full GC cycles in which the GC frees less memory each time it runs, as in the sketch below.
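For illustration only (the reduced heap size and the numbers below are made up, and the jar name is a placeholder), the test run might be started like this:
java -verbose:gc -Xmx512m -jar yourserver.jar
A leak then shows up as full collections whose after-GC value (the number after the arrow) keeps creeping upward:
[Full GC 410530K->310543K(524288K), 1.2301 secs]
[Full GC 490112K->420765K(524288K), 1.8457 secs]
[Full GC 524001K->501200K(524288K), 2.4987 secs]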
Update
I was misled by the
outofmemoryerror
tag. In fact, this is a JVM crash, and all you can do is try to update the installed Java. There are already some reports of "SIGSEGV (0xb)" crashes for builds 1.6.0_17 and 1.6.0_18 (like this question on SO). It's a JVM-internal problem.
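To see which build you are currently running before upgrading, check:
java -version
If it reports 1.6.0_17 or 1.6.0_18, moving to a newer update release is the first thing to try.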