JVMs (JDK 8 before Update 131) running in Docker containers ignored the cgroup limits set by the container environment. Instead, they queried the host's resources rather than what was allocated to the container. The result was catastrophic for the JVM: when it tried to allocate itself more resources (CPU or memory) than the cgroup limits permitted, the Docker daemon would notice and kill the JVM process, or the container itself if the Java program was running as PID 1.
Solution for the memory issue (possibly fixed in JDK 8 Update 131): As described above, the JVM was allocating itself more memory than the container allowed. This could be easily fixed by
- explicitly setting the max heap size (using -Xmx) when starting the JVM (prior to Update 131), or
- passing the flags -XX:+UnlockExperimentalVMOptions and -XX:+UseCGroupMemoryLimitForHeap (after Update 131).
Resolving the CPU issue (possibly fixed in JDK 8 Update 212): Again, as described above, a JVM running in Docker would look at the host hardware directly and obtain the total number of CPUs available. It would then size and optimize itself based on this CPU count.
- After JDK 8 Update 212, any JVM running in a Docker container respects the CPU limits allocated to the container and does not look at the host CPUs directly.
If a container with a CPU limit is started as below, the JVM will respect this limit and restrict itself to 1 CPU.
docker run -ti --cpus 1 -m 1G openjdk:8u212-jdk
// JVMs running in this container are restricted to 1 CPU.
HERE IS MY QUESTION: The CPU issue is probably fixed in JDK 8 Update 212, but what if I cannot update my JVM and I am running a version prior to Update 131? How can I fix the CPU issue?
Linux container support first appeared in JDK 10 and was then backported to 8u191, see JDK-8146115.
Earlier versions of the JVM obtained the number of available CPUs as follows.
Prior to 8u121, the HotSpot JVM relied on the sysconf(_SC_NPROCESSORS_ONLN) libc call. In turn, glibc read the system file /sys/devices/system/cpu/online. Therefore, in order to fake the number of available CPUs, one could replace this file using a bind mount. The replacement file holds a CPU range: writing 0-3 (echo 0-3) makes four CPUs visible; to set only one CPU, write echo 0 instead.
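To make the effect of the file contents concrete: the file holds a CPU list such as 0-3, and the count that ends up being reported is the number of CPUs in that list. The function below is only a rough sketch of that counting, not glibc's actual code; the name count_cpus_in_list is hypothetical.

```c
#include <stdlib.h>

/* Counts the CPUs named in a Linux CPU list string such as "0-3" or
 * "0-1,4-7" (the format of /sys/devices/system/cpu/online).
 * Returns -1 on malformed input. Hypothetical helper for illustration. */
int count_cpus_in_list(const char *s) {
    int count = 0;
    while (*s != '\0' && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0) return -1;      /* no number, or negative */
        long hi = lo;                           /* single CPU by default */
        if (*end == '-') {                      /* a range "lo-hi" */
            const char *p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo) return -1;
        }
        count += (int)(hi - lo + 1);
        if (*end == ',') end++;                 /* more entries follow */
        else if (*end != '\0' && *end != '\n') return -1;
        s = end;
    }
    return count;
}
```

With this, a file containing 0-3 counts as four CPUs and a file containing 0 counts as one, which is why the bind-mounted file only needs a different range to change what a pre-8u121 JVM sees.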
Since 8u121 the JVM became taskset aware. Instead of sysconf, it started calling sched_getaffinity to find the CPU affinity mask for the process.
This broke the bind mount trick. Unfortunately, you can't fake sched_getaffinity the same way as sysconf. However, it is possible to replace the libc implementation of sched_getaffinity using LD_PRELOAD.
I wrote a small shared library, proccount, that replaces both sysconf and sched_getaffinity. So, this library can be used to set the right number of available CPUs in all JDK versions before 8u191.
How it works
1. First, it reads cpu.cfs_quota_us and cpu.cfs_period_us to find if the container is launched with the --cpus option. If both are above zero, the number of CPUs is estimated as
cpu.cfs_quota_us / cpu.cfs_period_us
2. Otherwise it reads cpu.shares and estimates the number of available CPUs as
cpu.shares / 1024
Such CPU calculation is similar to how it actually works in a modern container-aware JDK.
3. The library defines (overrides) the sysconf and sched_getaffinity functions to return the number of processors obtained in (1) or (2).
How to compile
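As a rough sketch of the approach described above (this is not the actual proccount source), a minimal interposing library might look like the following. Several things here are assumptions: the cgroup v1 paths under /sys/fs/cgroup/cpu/, rounding fractional results up, treating the default cpu.shares value of 1024 as "no limit", and the helper names read_long, estimate_cpus and container_cpus.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumed cgroup v1 location as seen from inside the container. */
#define CPU_CGROUP "/sys/fs/cgroup/cpu/"

/* Reads one integer from a file; returns -1 if it cannot be read. */
static long read_long(const char *path) {
    long value = -1;
    FILE *f = fopen(path, "r");
    if (f != NULL) {
        if (fscanf(f, "%ld", &value) != 1) value = -1;
        fclose(f);
    }
    return value;
}

/* Estimated CPU count, or 0 when no limit is detected. */
int estimate_cpus(long quota, long period, long shares) {
    if (quota > 0 && period > 0)
        return (int)((quota + period - 1) / period);   /* quota/period, rounded up */
    if (shares > 0 && shares != 1024)                   /* assumption: 1024 = default, no limit */
        return (int)((shares + 1023) / 1024);           /* shares/1024, rounded up */
    return 0;
}

static int container_cpus(void) {
    return estimate_cpus(read_long(CPU_CGROUP "cpu.cfs_quota_us"),
                         read_long(CPU_CGROUP "cpu.cfs_period_us"),
                         read_long(CPU_CGROUP "cpu.shares"));
}

/* Overrides libc sysconf for the online-CPU query only. */
long sysconf(int name) {
    if (name == _SC_NPROCESSORS_ONLN) {
        int n = container_cpus();
        if (n > 0) return n;
    }
    /* Everything else is forwarded to the real libc implementation. */
    long (*real)(int) = (long (*)(int))dlsym(RTLD_NEXT, "sysconf");
    return real != NULL ? real(name) : -1;
}

/* Overrides libc sched_getaffinity to report a mask with n CPUs set. */
int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask) {
    int n = container_cpus();
    if (n <= 0) {   /* no limit detected: defer to the real call */
        int (*real)(pid_t, size_t, cpu_set_t *) =
            (int (*)(pid_t, size_t, cpu_set_t *))dlsym(RTLD_NEXT, "sched_getaffinity");
        return real != NULL ? real(pid, cpusetsize, mask) : -1;
    }
    memset(mask, 0, cpusetsize);
    for (int i = 0; i < n && (size_t)i < cpusetsize * 8; i++)
        CPU_SET_S(i, cpusetsize, mask);
    return 0;
}
```

Assuming the sketch is saved as proccount.c (a hypothetical file name), a shared library of this kind is typically built with something like gcc -O2 -fPIC -shared -o libproccount.so proccount.c -ldl and then preloaded when starting the JVM, for example LD_PRELOAD=/path/to/libproccount.so java <args>.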
How to use