I'm trying to turn a Nagios-NRPE check into a Check_MK one. The first one is:
check_procs -w 10 -c 15 -C crond
My attempt uses the State and count of processes rule, but it always raises a critical alert. The parameters of my rule are (extracted from the rules.mk configuration file):
'process': 'crond'
'okmax': 10
'okmin': 1
'warnmax': 15
'warnmin': 11
As the WATO configuration screen says nothing about critical thresholds, I assumed that any value outside the thresholds above raises a critical alert.
My problem is that, when this rule is active, a critical alert is raised even when the number of processes found is inside the OK threshold.
The Status detail of the alert is
CRIT - 7 processes (ok from 1 to 15)CRIT 1620.6 MB virtual, 28.2 MB resident, 2.7% CPU
I cannot understand this behaviour, so I suspect that I misunderstand the Check_MK threshold parameters or that I'm missing something.
Can you help me?
Thanks in advance.
As I suspected in the last paragraph of my question, I misunderstood the Check_MK threshold parameters.
The relevant Python code lines are in ~/share/check_mk/checks/ps.
So any value lower than warnmin raises a critical alert. Thus, in order to prevent this, the warn interval must include the ok one. In my example, the warnmin value should be lowered to match the okmin one.
In mathematical terms, the ok interval must be a subinterval of the warn one.
I wrongly guessed that these intervals should not overlap, but actually they must.
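The behaviour described above can be illustrated with a small Python sketch. It is not the code from ~/share/check_mk/checks/ps, only a restatement of the rule as I understand it: a count outside the warn interval is critical, and a count inside the warn interval but outside the ok interval is a warning. The function name evaluate_count and the two parameter dictionaries are hypothetical and exist only for this example.

```python
# Hypothetical restatement of the threshold logic described above.
# This is NOT the actual code from ~/share/check_mk/checks/ps.
OK, WARN, CRIT = 0, 1, 2


def evaluate_count(count, warnmin, okmin, okmax, warnmax):
    """Return a Nagios-style state (0, 1 or 2) for a process count."""
    if count < warnmin or count > warnmax:
        return CRIT  # outside the warn interval
    if count < okmin or count > okmax:
        return WARN  # inside the warn interval, outside the ok interval
    return OK


# Parameters from the question: the warn interval (11..15) does not
# contain the ok interval (1..10), so any count below 11 is critical.
broken = dict(warnmin=11, okmin=1, okmax=10, warnmax=15)

# Assumed fix: lower warnmin to okmin so that ok is a subinterval of warn.
fixed = dict(warnmin=1, okmin=1, okmax=10, warnmax=15)

print(evaluate_count(7, **broken))  # 2 (CRIT)
print(evaluate_count(7, **fixed))   # 0 (OK)
print(evaluate_count(12, **fixed))  # 1 (WARN)
print(evaluate_count(16, **fixed))  # 2 (CRIT)
```

With the corrected values, 7 processes fall inside the ok interval, 11 to 15 processes give a warning, and more than 15 (or fewer than 1) give a critical alert.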