CPU usage reported in /proc/stat is inconsistent (wrong number of cpu ticks)


Summary of the issue:

In order to report CPU usage per core, I decided to rely on the /proc/stat file populated by the kernel. I am running an application inside a Docker container, pinned to core 1 of my CPU (via the --cpuset-cpus option when calling docker run). I put no limit on CPU usage for this core (i.e. no --cpus option).

Among the different threads of my application, one in particular generates most of the CPU load. Thanks to the core pinning I know that this load lands only on core 1, and since htop reports 25% CPU usage for this thread, I would expect at least 25% usage on core 1 in the bar graph. However, that is not always the case. I say not "always" because, for no obvious reason, the percentage sometimes looks consistent and sometimes does not.

Based on the documentation I read, the /proc/stat "file" is supposed to report the accumulated CPU ticks per "category" (i.e. user level, kernel level, idle, etc.). So, collecting values every second, summing all columns for core 1, and then computing the difference with the previous result should give a value very close to the configured kernel jiffies/sec (i.e. 100 on my system). However, I rather get values like 73 or 75. Since I collect the values through a bash script, I know the polling period is not very precise, but that does not explain such a big difference (75 vs 100, which, incidentally, is very close to the 25% load that is missing).
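For reference, the sampling loop described above can be sketched roughly like this (a minimal sketch; `cpu1` is an assumption matching the core the container is pinned to, and the script prints 0 if that line is absent):

```shell
#!/bin/sh
# Minimal sketch of the per-core tick sampling described above.
# CORE names the /proc/stat line to watch (cpu1 = the pinned core).
CORE=cpu1

sum_ticks() {
    # Sum every tick column (user, nice, system, idle, iowait, irq,
    # softirq, steal, guest, guest_nice) of the chosen core's line.
    awk -v core="$CORE" 'BEGIN { t = 0 }
        $1 == core { for (i = 2; i <= NF; i++) t += $i }
        END { print t }' /proc/stat
}

t0=$(sum_ticks)
sleep 1
t1=$(sum_ticks)
# With HZ=100 this difference should stay close to 100 ticks per second.
echo "ticks elapsed on $CORE in ~1s: $((t1 - t0))"
```

Running this in a loop and printing the delta each second reproduces the 73/75-instead-of-100 observation described above.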

[screenshot: htop showing inconsistent CPU load]

In the screenshot above you can see the processes ordered by CPU usage. I configured htop not to hide kernel tasks. You can see that the 25% load is not reflected in the bar graph in the top-left corner. For info, htop was run with the -d 10 option to refresh the values every second. I have not provided a video, but the percentages in the bar graph always remain under 10% over time.

Below is the /proc/stat extract per second plus the computed diff:

[screenshot: /proc/stat extract per second + computed diff]

EDIT1: here is a second case I am facing (which highlights the "random" behavior, as we now over-estimate):

[screenshot: htop with a wrong bar graph]

And the /proc/stat extract per second + computed diff

[screenshot: wrong /proc/stat content]


More details:

  • Observed on a RPi 4 running Debian 11 (bullseye) with kernel
Linux raspberrypi 6.1.19-v8+ #1637 SMP PREEMPT Tue Mar 14 11:11:47 GMT 2023 aarch64 GNU/Linux

I also observed a similar behavior on an x86 machine running Ubuntu 22.04, with kernel 5.15.0-91-lowlatency #101-Ubuntu SMP PREEMPT, so it does not seem tied to a specific processor architecture / OS / kernel.

  • Using Docker version 23.0.2.

  • CPU governor set to performance for the core in question (switching from ondemand to performance had no impact on bar graph accuracy). Moreover, with ondemand I would rather expect some idle cycles to be missed, whereas here it looks like system cycles are missed (the thread is scheduled with a real-time priority, so it should be accounted in that category if I understand correctly).

  • When looking at the /proc/process_pid/stat file and computing CPU usage, I get the value that htop reports for the thread. Moreover, when I modulate the load in my thread, this value evolves as expected.
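The per-task computation mentioned in this point can be sketched as below (PID=$$ is only a placeholder for the busy thread's ID; fields 14 and 15 of /proc/<pid>/stat are utime and stime in clock ticks):

```shell
#!/bin/sh
# Sketch: CPU% of one task from /proc/<pid>/stat over a 1 s window.
# PID=$$ is a placeholder -- point it at the thread of interest.
PID=$$
HZ=$(getconf CLK_TCK)

cpu_ticks() {
    # The comm field (2) may contain spaces, so split after ") " first;
    # utime/stime are then entries 12 and 13 of the remainder.
    awk '{ split($0, a, ") "); split(a[2], f, " "); print f[12] + f[13] }' \
        "/proc/$PID/stat"
}

u0=$(cpu_ticks)
sleep 1
u1=$(cpu_ticks)
echo "CPU usage over ~1s: $(( (u1 - u0) * 100 / HZ ))%"
```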

  • I read somewhere that cgroups v2 are "required" in order to get proper values when running Docker in rootless mode, so I am using them. Despite that, I still get inconsistent values in /proc/stat, in a random manner.

  • The HZ configured on my system is 100:

pi@raspberrypi:~ $ getconf CLK_TCK
100
  • The irq column is always zero in my /proc/stat file. At some point I therefore suspected that most of the load came from IRQ processing and was not properly reported. However, looking at the /proc/interrupts file, I discovered that handling of the network-related IRQs (the thread also sends and receives some TCP packets) is done on core 0, so if the load were mainly IRQ processing I would see it on core 0, which is not the case.

On core 1 I observed the following values, without being able to say whether the per-second occurrences of arch_timer are excessive or not:

[screenshot: interrupt counts from /proc/interrupts]
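One way to judge the arch_timer rate is to diff /proc/interrupts over one second (a sketch; the assumption that the third field is the CPU1 column should be verified against the header line on your system, and the script prints 0 if no arch_timer line exists):

```shell
#!/bin/sh
# Sketch: per-second delta of arch_timer interrupts in the CPU1 column.
# Column 3 = CPU1 is an assumption -- check the /proc/interrupts header.
snap() {
    awk 'BEGIN { s = 0 } /arch_timer/ { s += $3 } END { print s }' \
        /proc/interrupts
}

c0=$(snap)
sleep 1
c1=$(snap)
echo "arch_timer IRQs on CPU1 in ~1s: $((c1 - c0))"
```

With HZ=100 and a periodic tick, roughly 100 arch_timer interrupts per second per busy core would be the expected order of magnitude.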

Questions:

Do you know what could explain this incorrect reporting of CPU tick counts in /proc/stat? I have heard of the NOHZ option in the kernel; does it look relevant to this case?
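For what it's worth, whether NOHZ is in play can be checked from userspace (a sketch; the config file location is a typical Debian path and may differ on the RPi, and /sys/devices/system/cpu/nohz_full only exists when the kernel was built with CONFIG_NO_HZ_FULL):

```shell
#!/bin/sh
# Sketch: inspect the NOHZ configuration of the running kernel.
# The /boot/config-* path is an assumption and varies per distribution.
grep NO_HZ "/boot/config-$(uname -r)" 2>/dev/null || true
# Cores that run tickless while busy with a single task (NO_HZ_FULL only):
cat /sys/devices/system/cpu/nohz_full 2>/dev/null || true
# Boot-time overrides passed on the kernel command line, if any:
grep -o 'nohz[^ ]*' /proc/cmdline || true
```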
