I'm trying to turn a Nagios-NRPE check into a Check_MK one. The first one is:
check_procs -w 10 -c 15 -C crond
My attempt is to use the State and count of processes rule, but it always raises a critical alert. The parameters of my rule (extracted from the rules.mk configuration file) are:
'process': 'crond'
'okmax': 10
'okmin': 1
'warnmax': 15
'warnmin': 11
The WATO config screen says nothing about critical thresholds, so I assumed that any value outside the intervals above raises a critical alert.
My problem is that, when this rule is active, a critical alert is raised even when the number of processes found is inside the OK interval.
The Status detail of the alert is:
CRIT - 7 processes (ok from 1 to 15)CRIT 1620.6 MB virtual, 28.2 MB resident, 2.7% CPU
I cannot understand this behaviour; I suspect that I misunderstand the Check_MK threshold parameters or that I'm missing something.
Can you help me?
Thanks in advance.
As I suspected in the last paragraph of my question, I misunderstood the Check_MK threshold parameters.
The relevant Python code is in ~/share/check_mk/checks/ps. According to it, any value lower than warnmin raises a critical alert. To prevent this, the warn interval must include the ok one. In my example, the warnmin value should be lowered to match the okmin one.
In mathematical terms, the ok interval must be a subinterval of the warn one.
I wrongly guessed that these intervals should not overlap, but actually they must.
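To make the nesting rule concrete, here is a minimal Python sketch of the evaluation order as I understand it. It is an illustration only, not the actual code from the ps check: the function name classify and the string return values are my own assumptions, and I assume that values above warnmax are treated the same way as values below warnmin.

```python
def classify(count, okmin, okmax, warnmin, warnmax):
    """Hypothetical helper: map a process count to a state.

    Assumption: the warn interval is tested first, so anything
    outside [warnmin, warnmax] is critical regardless of the ok interval.
    """
    if count < warnmin or count > warnmax:
        return "CRIT"
    # Inside the warn interval but outside the ok interval.
    if count < okmin or count > okmax:
        return "WARN"
    return "OK"


# Parameters from the question: the ok and warn intervals do not overlap,
# so a count of 7 falls below warnmin and is reported as critical.
print(classify(7, okmin=1, okmax=10, warnmin=11, warnmax=15))

# Corrected parameters: warnmin lowered to match okmin,
# so the ok interval is a subinterval of the warn interval.
for count in (7, 12, 16):
    print(count, classify(count, okmin=1, okmax=10, warnmin=1, warnmax=15))
```

With the corrected values, counts from 1 to 10 are OK, 11 to 15 give a warning, and anything else is critical.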