Unexplained increasing RES (resident) memory usage of java process


I'm investigating memory issues in a Java 17 Spring Boot 3 application that was ported from Glassfish/Java 8. It runs on Kubernetes.

  • The container memory request = limit = 5.5GiB. The heap is -Xms = -Xmx = 3.5GiB.
  • The container is long lived and runs a Java process that uses the Quartz scheduler to run jobs.

Eventually containerd oom-kills the Java process when it hits the container memory limit. Monitoring in jconsole shows GC keeping the heap comfortably within the 3.5GiB. Enabling NMT (Native Memory Tracking) does not show any non-heap memory pool whose size looks concerning.

I also use NMT to produce a baseline before running a "heap intensive" short-lived job, and then run a comparison against that baseline after the job finishes. I monitor with jconsole and top as well. What I find is that top reports that the job execution causes a permanent increase in RES memory of the java process, even though neither jconsole nor NMT suggests any reason for this.

For NMT's Total Committed:

  • NMT before job run: 4.535GiB (4755595KB). top on the container reports RES (i.e. process resident memory) of 3.1GiB (strange??)
  • NMT after job run: 4.56GiB (4778088KB, up by only 22493KB). top on the container reports a massive increase in RES, to 4.9GiB, and it never returns to a lower value.
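
The figures above can be cross-checked with a little arithmetic. This is only a sketch: it assumes NMT's KB values are 1024-byte units and that the top readings are in GiB, and the class and method names are made up for illustration.

```java
import java.util.Locale;

public class NmtVsRes {
    // 1 GiB = 1024 * 1024 KB, assuming NMT's "KB" means 1024 bytes
    static final double KB_PER_GIB = 1024.0 * 1024.0;

    public static double kbToGib(long kb) {
        return kb / KB_PER_GIB;
    }

    public static void main(String[] args) {
        // Values copied from the question
        long committedBeforeKb = 4_755_595L;
        long committedAfterKb = 4_778_088L;
        double resBeforeGib = 3.1;
        double resAfterGib = 4.9;

        double nmtGrowthGib = kbToGib(committedAfterKb - committedBeforeKb);
        double resGrowthGib = resAfterGib - resBeforeGib;
        double gapAfterGib = resAfterGib - kbToGib(committedAfterKb);

        System.out.println(String.format(Locale.ROOT,
                "NMT committed growth: %.3f GiB", nmtGrowthGib));
        System.out.println(String.format(Locale.ROOT,
                "RES growth:           %.2f GiB", resGrowthGib));
        System.out.println(String.format(Locale.ROOT,
                "RES minus NMT committed after the job: %.2f GiB", gapAfterGib));
    }
}
```

With the numbers from the question, NMT committed grows by roughly 0.02GiB while RES grows by 1.8GiB, so the job's effect on RES is almost entirely invisible to NMT.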

Watching jconsole during this job execution shows heap usage staying well within -Xmx, with regular GC cycles doing their job; heap usage returns to normal levels after the job.

The NMT output:

Create baseline pre run:

[root@host-1 opt]# sudo -u appuser /opt/java/openjdk/bin/jcmd 191 VM.native_memory baseline
191:
Baseline succeeded

And then post job generating a diff:

[root@host-1 opt]# sudo -u appuser /opt/java/openjdk/bin/jcmd 191 VM.native_memory summary.diff
191:

Native Memory Tracking:

(Omitting categories weighting less than 1KB)

Total: reserved=5558072KB +11469KB, committed=4778088KB +22493KB

-                 Java Heap (reserved=3670016KB, committed=3670016KB)
                            (mmap: reserved=3670016KB, committed=3670016KB)
 
-                     Class (reserved=331083KB +223KB, committed=26571KB +543KB)
                            (classes #33968 +545)
                            (  instance classes #31813 +515, array classes #2155 +30)
                            (malloc=3403KB +223KB #83118 +5582)
                            (mmap: reserved=327680KB, committed=23168KB +320KB)
                           : (  Metadata)
                            (    reserved=196608KB, committed=161088KB +4736KB)
                            (    used=160529KB +4619KB)
                            (    waste=559KB =0.35% +117KB)
                           : (  Class space)
                            (    reserved=327680KB, committed=23168KB +320KB)
                            (    used=22637KB +321KB)
                            (    waste=531KB =2.29% -1KB)
 
-                    Thread (reserved=192733KB +9279KB, committed=20945KB +995KB)
                            (thread #0)
                            (stack: reserved=192188KB +9252KB, committed=20400KB +968KB)
                            (malloc=327KB +16KB #1128 +54)
                            (arena=218KB +11 #373 +18)
 
-                      Code (reserved=254471KB +1005KB, committed=105347KB +15257KB)
                            (malloc=6787KB +1005KB #31852 +3340)
                            (mmap: reserved=247684KB, committed=98560KB +14252KB)
 
-                        GC (reserved=243202KB +494KB, committed=128490KB +494KB)
                            (malloc=13346KB +494KB #39503 +4498)
                            (mmap: reserved=229856KB, committed=115144KB)
 
-                  Compiler (reserved=1237KB +48KB, committed=1237KB +48KB)
                            (malloc=1072KB +48KB #3239 +180)
                            (arena=165KB #5)
 
-                  Internal (reserved=7246KB +271KB, committed=7246KB +271KB)
                            (malloc=7210KB +271KB #38329 +4138)
                            (mmap: reserved=36KB, committed=36KB)
 
-                     Other (reserved=587124KB +186KB, committed=587124KB +186KB)
                            (malloc=587124KB +186KB #99 +6)
 
-                    Symbol (reserved=38580KB +343KB, committed=38580KB +343KB)
                            (malloc=36334KB +279KB #888476 +9518)
                            (arena=2246KB +64 #1)
 
-    Native Memory Tracking (reserved=17131KB +436KB, committed=17131KB +436KB)
                            (malloc=38KB +1KB #573 +12)
                            (tracking overhead=17093KB +435KB)
 
-        Shared class space (reserved=16384KB, committed=12056KB)
                            (mmap: reserved=16384KB, committed=12056KB)
 
-               Arena Chunk (reserved=479KB -991KB, committed=479KB -991KB)
                            (malloc=479KB -991KB)
 
-                   Tracing (reserved=32KB, committed=32KB)
                            (arena=32KB #1)
 
-                    Module (reserved=792KB +93KB, committed=792KB +93KB)
                            (malloc=792KB +93KB #4842 +285)
 
-                 Safepoint (reserved=8KB, committed=8KB)
                            (mmap: reserved=8KB, committed=8KB)
 
-           Synchronization (reserved=150KB +11KB, committed=150KB +11KB)
                            (malloc=150KB +11KB #1575 +105)
 
-            Serviceability (reserved=1KB, committed=1KB)
                            (malloc=1KB #12)
 
-                 Metaspace (reserved=197375KB +59KB, committed=161855KB +4795KB)
                            (malloc=767KB +59KB #558 +81)
                            (mmap: reserved=196608KB, committed=161088KB +4736KB)
 
-      String Deduplication (reserved=1KB, committed=1KB)
                            (malloc=1KB #8)
 
-           Object Monitors (reserved=29KB +13KB, committed=29KB +13KB)
                            (malloc=29KB +13KB #141 +66)
 
[root@host-1 opt]# 
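
To track the totals over several job runs without reading the diff by eye, the "Total:" line can be parsed mechanically. The following is a rough sketch that assumes the line keeps exactly the shape shown in the output above; the class, record and method names are hypothetical, and nothing here is part of jcmd itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NmtTotalLine {
    // Deltas are 0 when the line carries no "+N"/"-N" suffix (plain summary output)
    public record Total(long reservedKb, long reservedDeltaKb,
                        long committedKb, long committedDeltaKb) {}

    // Matches e.g. "Total: reserved=5558072KB +11469KB, committed=4778088KB +22493KB"
    private static final Pattern TOTAL = Pattern.compile(
            "Total: reserved=(\\d+)KB(?: ([+-]\\d+)KB)?, "
            + "committed=(\\d+)KB(?: ([+-]\\d+)KB)?");

    public static Total parse(String line) {
        Matcher m = TOTAL.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not an NMT Total line: " + line);
        }
        return new Total(
                Long.parseLong(m.group(1)), delta(m.group(2)),
                Long.parseLong(m.group(3)), delta(m.group(4)));
    }

    private static long delta(String s) {
        // Long.parseLong accepts a leading '+' or '-' sign
        return s == null ? 0L : Long.parseLong(s);
    }

    public static void main(String[] args) {
        Total t = parse(
                "Total: reserved=5558072KB +11469KB, committed=4778088KB +22493KB");
        System.out.println("committed KB = " + t.committedKb()
                + ", delta KB = " + t.committedDeltaKb());
    }
}
```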

Meaning of the field RES (Resident Memory Size in KiB): Stands for a subset of the virtual memory space (VIRT) representing the non-swapped physical memory a task is currently using.
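
The same resident figure can also be read from inside the process, which makes it easier to log it next to the NMT totals. A minimal sketch, assuming a Linux container where /proc/self/status is readable and that its VmRSS field corresponds closely to the RES column in top; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ResidentSetSize {
    // Returns the VmRSS value in kB, or -1 if the field is not present
    public static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:    5138022 kB"
                String[] parts = line.substring("VmRSS:".length())
                        .trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1L;
    }

    public static void main(String[] args) throws IOException {
        // /proc/self/status describes the JVM running this code;
        // for another process, read /proc/<pid>/status instead
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status is not readable on this system");
            return;
        }
        long rssKb = parseVmRssKb(Files.readAllLines(status));
        System.out.println("VmRSS = " + rssKb + " kB");
    }
}
```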

This confirms my expectation that RES should be fairly close to NMT Total Committed. Note that this question is related but didn't answer mine: "Difference between Resident Set Size (RSS) and Java total committed memory (NMT) for a JVM running in Docker container". It seems to suggest that NMT total committed may often be higher than RES, which is the opposite of what I'm seeing.

My questions are:

  1. Why is there such a discrepancy between the RES value reported by top (4.9GiB) and Java's NMT total committed (4.56GiB)?
  2. Am I interpreting the output of NMT/top/jconsole correctly?
  3. What tools should I be using to track down why RES is so high, given that it does not seem to be a JVM heap size issue and NMT suggests it is not native memory either? (This leaves me asking what other memory there is.)
  4. Any other pointers?