Prometheus Expressions for CPU and Memory Usage Conditions

Asked by Ron Sharabi on 21 January 2024 at 16:40 (137 views)

I'm setting up Grafana alerts and need guidance on the alert conditions. I want two separate alerts: one that triggers when the average CPU usage of any pod (across all namespaces) over the last 10 minutes exceeds 90% of that pod's CPU limit, and an equivalent one for memory usage.

Can someone help with the expressions for these scenarios?

I tried this for Memory usage: avg_over_time(container_memory_usage_bytes[10m]) >= kube_pod_container_resource_limits{resource="memory"} * 0.9

This, of course, didn't work. I expect the query to return the pods whose average CPU or memory usage over the last 10 minutes exceeds 90% of the pod's actual CPU or memory limit.

Update: I think I managed to build one of the queries I wanted, but only for a specific pod. Here is the query for memory: avg_over_time(container_memory_usage_bytes{pod="nginx-f7d787f6c-t8x9s", container="nginx"}[10m]) > on(pod_uid) kube_pod_container_resource_limits{resource="memory", pod="nginx-f7d787f6c-t8x9s", container="nginx"} * 0.9

I need this query to run for all pods in the cluster, not just for one specific pod.
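
For reference, here is a rough sketch of how the per-pod query might be generalized. One likely reason the unqualified comparison returns nothing is that PromQL binary operators only pair series whose label sets match, and the cAdvisor usage series and the kube-state-metrics limit series carry different extra labels. The sketch sidesteps this by aggregating both sides down to the labels they share. It assumes both metric families expose `namespace`, `pod`, and `container` labels, and that the `container!=""` and `container!="POD"` filters are enough to drop the pod-level and pause-container series; both assumptions should be checked against the actual label sets in the cluster.

```promql
# Memory: containers whose 10-minute average usage exceeds 90% of their limit.
# Assumes namespace/pod/container labels exist on both metrics.
max by (namespace, pod, container) (
  avg_over_time(container_memory_usage_bytes{container!="", container!="POD"}[10m])
)
> 0.9 * max by (namespace, pod, container) (
  kube_pod_container_resource_limits{resource="memory"}
)

# CPU: rate() over 10m gives the average number of cores used in that window,
# which is directly comparable to a CPU limit expressed in cores.
max by (namespace, pod, container) (
  rate(container_cpu_usage_seconds_total{container!="", container!="POD"}[10m])
)
> 0.9 * max by (namespace, pod, container) (
  kube_pod_container_resource_limits{resource="cpu"}
)
```

These are two separate expressions, one per alert. Because both sides of each comparison end up with the same three labels, no `on(...)` clause should be needed, and containers without a limit simply produce no result.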

Example:

Pod name	Containers	Average memory usage (last 10 minutes)	Memory limit
Pod1	5	9Mi	10Mi
Pod2	3	9Mi	1000Mi

Since the containers of Pod1 are using 90% of their memory limit, I expect them to appear in the query result.
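
If the figures in the example are read as pod totals (an assumption, since the table does not say whether they are per container or per pod), a pod-level variant could sum usage and limits over each pod's containers. With those numbers, Pod1 sits at 9Mi / 10Mi = 90% and Pod2 at 9Mi / 1000Mi = 0.9%. A strict `>` would leave out a pod at exactly 90%, so this sketch uses `>=`, as in the first attempt. The same label assumptions apply (`namespace`, `pod`, and `container` present on both metrics).

```promql
# Pod-level memory check: total 10-minute average usage of a pod's containers
# compared with 90% of the sum of those containers' memory limits.
sum by (namespace, pod) (
  avg_over_time(container_memory_usage_bytes{container!="", container!="POD"}[10m])
)
>= 0.9 * sum by (namespace, pod) (
  kube_pod_container_resource_limits{resource="memory"}
)
```

One caveat with this form: a container that has no memory limit still adds to the usage sum but not to the limit sum, so pods that mix limited and unlimited containers may be flagged too eagerly.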
