Can Prometheus Alertmanager distinguish between an event and a condition?


We have a Kubernetes system that, among other activities, handles thousands of incoming inputs from sensors. Some sensors stop reporting from time to time, so we can raise an alert for the disconnection event. When a sensor comes back, we would like to get an event for that as well. Between these two events (disconnection and reconnection), the status of a specific sensor is either OK or NOK, and we would like to see which sensors are currently disconnected without going through all the issued events each time.

Can we do that with Prometheus Alertmanager? If so, could you point to the possible ways to accomplish this? If not, what would be your default way to handle this requirement?


There is 1 answer

Yiadh TLIJANI

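This has to be handled on the Prometheus server side, rather than in Alertmanager, by adding self-monitoring alert rules. For your case, the relevant one is the PrometheusTargetMissing alert, which fires while a target's `up` metric is 0 and resolves once the target is scraped successfully again: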

  - alert: PrometheusTargetMissing
    expr: up == 0   # `up` is 0 when Prometheus fails to scrape a target
    for: 0m         # fire immediately, with no grace period
    labels:
      severity: critical
    annotations:
      summary: "Prometheus target missing (instance {{ $labels.instance }})"
      description: "A Prometheus target has disappeared. An exporter might have crashed.\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

Reference: https://awesome-prometheus-alerts.grep.to/rules.html#rule-prometheus-self-monitoring-2
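
For the reconnection event mentioned in the question, one possible approach is to let Alertmanager send a "resolved" notification when the alert stops firing. The following is a minimal sketch of an Alertmanager configuration, not a tested setup: the receiver name `sensor-events` and the webhook URL are hypothetical placeholders, and it assumes each sensor is visible to Prometheus as its own scrape target, as the `up == 0` rule implies.

```yaml
# alertmanager.yml (sketch; receiver name and URL are hypothetical)
route:
  receiver: sensor-events
  # Group per alert and per instance so that each target produces
  # its own "firing" and "resolved" notifications.
  group_by: ['alertname', 'instance']

receivers:
  - name: sensor-events
    webhook_configs:
      - url: 'http://example.internal:9000/sensor-events'  # hypothetical endpoint
        # Also notify when the alert is resolved, i.e. the target is back.
        send_resolved: true
```

Between the two notifications the alert remains in the firing state, so under the same assumption the currently disconnected targets can be read as a condition rather than reconstructed from events, for example from the list of active alerts in Prometheus or Alertmanager, or by running the query `up == 0`.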