SLA Calculation with PromQL

Question

Asked by Prashant Pandey on 23 February 2024 at 21:13

I have a time-series:

sum(ALERTS{alertname="IngestionStopped", alertstate="firing"} unless on(table) (ALERTS{alertname="MyAlert1",alertstate="firing"} OR ALERTS{alertname="MyAlert2",alertstate="firing"}) OR vector(0))

I use sum because there is one time series for each partition of the table, and I care even if only a single partition has stopped ingesting.

This time series is 0 when my service is working fine. If it is > 0, something is wrong with the service. I want to calculate the percentage of time my service was not working fine (meaning this time series was > 0). How can I do that?

There is 1 answer

**markalex** · Answer 1 · 2024-02-24T11:05:27+00:00

For any query that continuously produces a value of 0 or 1, you can compute its average over time with the avg_over_time function applied to a subquery, like this:

avg_over_time( (<your_query>) [range:resolution] )

Here range is the time range over which you want to calculate the average, and resolution is how often your query should be evaluated within that range.
The resolution can also be omitted (while keeping the :). In that case the global evaluation interval (evaluation_interval from the config, 1m by default) is used.
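
As a minimal sketch of the two subquery forms, the following uses the built-in up metric, which is already 0 or 1. The job label value "my-service" is a hypothetical name chosen for illustration, not something from the question.

```promql
# Explicit resolution: evaluate the inner query every 1m over the last 7 days.
# job="my-service" is an assumed label value.
avg_over_time( (up{job="my-service"})[7d:1m] )

# Resolution omitted (the colon stays): the global evaluation_interval is used.
avg_over_time( (up{job="my-service"})[7d:] )
```

Either form returns a value between 0 and 1: the fraction of evaluated samples at which the target was up.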

Since your query can produce values other than 0 and 1, and for this purpose any value above 0 should be treated as 1, modify it by adding > bool 0. The bool modifier makes the comparison return 1 for every value that satisfies the condition and 0 for every value that does not.
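
To illustrate why the bool modifier matters here, the sketch below contrasts the two comparison modes on a simplified version of the inner query (the unless clause is left out for brevity).

```promql
# Without bool: the comparison acts as a filter. The result keeps its original
# value when it is > 0 and produces no sample at all otherwise.
sum(ALERTS{alertname="IngestionStopped", alertstate="firing"} or vector(0)) > 0

# With bool: a sample is always returned, with value 1 (true) or 0 (false).
sum(ALERTS{alertname="IngestionStopped", alertstate="firing"} or vector(0)) > bool 0
```

With the filtering form, the "healthy" moments would yield no samples, so avg_over_time would average only the non-zero values instead of the share of time spent above 0.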

So the final query would be:

avg_over_time(
  (
    sum(
      ALERTS{alertname="IngestionStopped", alertstate="firing"}
      unless on(table) (
        ALERTS{alertname="MyAlert1", alertstate="firing"}
        or ALERTS{alertname="MyAlert2", alertstate="firing"}
      )
      or vector(0)
    )
    > bool 0
  )[30d:1m]
)
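
The result of avg_over_time here is a fraction between 0 and 1. Since the question asks for a percentage, one possible follow-up is to multiply by 100; subtracting the fraction from 1 first gives the share of time the service was healthy instead. This is a sketch built on the same query and the same 30d:1m window, not a separate method.

```promql
# Percentage of the last 30 days during which ingestion was stopped.
100 * avg_over_time(
  (
    sum(
      ALERTS{alertname="IngestionStopped", alertstate="firing"}
      unless on(table) (
        ALERTS{alertname="MyAlert1", alertstate="firing"}
        or ALERTS{alertname="MyAlert2", alertstate="firing"}
      )
      or vector(0)
    )
    > bool 0
  )[30d:1m]
)

# Percentage of time the service was healthy: the same expression with <= bool 0.
100 * avg_over_time(
  (
    sum(
      ALERTS{alertname="IngestionStopped", alertstate="firing"}
      unless on(table) (
        ALERTS{alertname="MyAlert1", alertstate="firing"}
        or ALERTS{alertname="MyAlert2", alertstate="firing"}
      )
      or vector(0)
    )
    <= bool 0
  )[30d:1m]
)
```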

Adjust the resolution to suit your situation, but remember that alert rules are evaluated (and the ALERTS metric subsequently updated) only once every evaluation_interval, so there is nothing to gain from a resolution much smaller than that.

A demo of a similar query can be seen here.
