NMI watchdog messages, e.g. 'Shutting down hard lockup detector on all cpus'


When NMI watchdog has been "disabled" it is still chatty.

Does anyone know where the docs for these messages live? I'd like to see what is actually happening.

For example, I verified that it is disabled:

$ cat /proc/sys/kernel/nmi_watchdog
0

Yet we still see messages like the following on shutdown or boot:

$ journalctl -xn 100000  | grep "NMI watchdog"
Oct 23 14:29:31 hostname-us kernel: NMI watchdog: disabled (cpu0): hardware events not enabled
Oct 23 14:29:31 hostname-us kernel: NMI watchdog: Shutting down hard lockup detector on all cpus

Now, I know that this isn't a RESET; it's something else, and I'd like the documented answer, not a best guess.

I tried looking through kernel.org, debian.org, and the man pages with no luck; I found only archived Bugzilla pages.

We'd like to know what these messages actually mean, not make assumptions. Does anyone know where the decoder ring lives?


There are 2 answers

0
Melanie Cooper On

I should have known it wouldn't be an exact match, but I finally managed to find it on kernel.org:

https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html
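That page documents several watchdog-related sysctls alongside nmi_watchdog. As a rough sketch, they can be read together to see the overall watchdog state. The function name show_watchdog_sysctls and its optional directory argument are made up for illustration; the file names are the ones I believe the admin guide lists, and some may be missing on a given kernel.

```shell
#!/bin/sh
# Sketch: print the watchdog-related sysctls for the running kernel.
# show_watchdog_sysctls is a hypothetical helper name; the optional
# argument (default /proc/sys/kernel) exists only to make testing easy.
show_watchdog_sysctls() {
    dir="${1:-/proc/sys/kernel}"
    for name in nmi_watchdog soft_watchdog watchdog watchdog_thresh; do
        if [ -r "$dir/$name" ]; then
            # Each file holds a single small integer.
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            # Not every kernel build exposes every file.
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_watchdog_sysctls
```

A value of 0 in nmi_watchdog matches the output shown in the question; the other entries show whether the soft lockup detector is still active.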

1
Milton Osses On

From http://slacksite.com/slackware/nmi.html

The NMI watchdog is a kind of timer event handler: on every local timer event of each CPU, it checks the Local APIC or IO-APIC interrupt counter register. Generally speaking, hundreds of device and timer interrupts are received per second. If no interrupts are received in a 5-second interval, the NMI watchdog assumes that the system has hung and initiates a kernel panic. This is very helpful when you need data for investigating the issue, but occasionally it may have undesirable effects.
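To see from userspace whether NMIs are being delivered at all, one option is to watch the NMI row of /proc/interrupts. This is only a sketch: nmi_total is a hypothetical helper name, and it assumes the x86 layout, where the row is labelled "NMI:" and is followed by one count per CPU.

```shell
#!/bin/sh
# Sketch: sum the per-CPU NMI counts from /proc/interrupts.
# nmi_total is a hypothetical helper; the optional argument lets you
# point it at a saved copy of the file instead of the live one.
nmi_total() {
    file="${1:-/proc/interrupts}"
    # Print 0 if the file is missing or unreadable.
    [ -r "$file" ] || { echo 0; return 0; }
    awk '$1 == "NMI:" {
             # Add up the numeric per-CPU columns, skip the description.
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) sum += $i
         }
         END { print sum + 0 }' "$file"
}

# Usage idea: take two samples a few seconds apart and compare them.
#   before=$(nmi_total); sleep 5; after=$(nmi_total)
#   echo "NMIs delivered in 5s: $((after - before))"
```

A count that stays flat between samples is consistent with the hard lockup detector being off, although other sources can also raise NMIs, so treat the number as a hint rather than proof.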