Prometheus metrics are missing when collection takes longer than the scrape interval


Overview

I am new to Prometheus. My custom exporter takes IP addresses as input, does some stream processing, and then registers new metrics with the Prometheus registry. The code is below:

#!/usr/bin/env python3

from prometheus_client import REGISTRY, start_http_server
from prometheus_client.metrics_core import GaugeMetricFamily
import time

count = 0

class TestExporter:
    
    def collect(self):
        global count
        count += 5
        
        for i in range(3):
            yield self.check_stream(count, i)
    
    def check_stream(self, count, host):
        # Simulated processing time; assume 5 seconds per host.
        time.sleep(5)
        # Declare two label names, because add_metric() below passes two label values.
        metric = GaugeMetricFamily('aa_stream_test', 'testing stream delete', labels=['stream', 'host'])
        if count > 100:
            metric.add_metric(['B', str(host)], count)
        else:
            metric.add_metric(['A', str(host)], count)
        print(count, 'registering . . .')
        return metric

if __name__ == '__main__':
    
    REGISTRY.register(TestExporter())
    
    # Start up the server to expose the metrics.
    start_http_server(8000)
    print("started server...")

    while True:
        time.sleep(1)

Issue

In the check_stream function, I have set an estimate of how long processing a request can take. For now I have set it to 5 seconds, but it can vary.

The check_stream function is executed 3 times, so if I use time.sleep(3), I can see the metrics in the Prometheus graph, because the total time is 3 * 3 = 9 seconds, which is less than the 10-second Prometheus scrape interval. But if I use time.sleep(5), the total is 5 * 3 = 15 seconds, which is >= the scrape interval. In that case I don't see any values in the Prometheus graph; it is empty.

Please help. Thanks.

1 Answer

DazWilkin

Prometheus configuration has a scrape_timeout that defaults to 10 seconds.

Prometheus attempts to scrape your target and, if the target takes longer than scrape_timeout to respond, the scrape times out and its metrics are not ingested.
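Applied to the exporter in the question, this is simple arithmetic: collect() calls check_stream three times in sequence, so the response time is roughly three times the per-call delay. Here is a minimal sketch of that check, assuming the default 10-second scrape_timeout and ignoring network and serialization overhead. The helper names are hypothetical and are not part of prometheus_client.

```python
# Rough estimate of whether a sequential collect() finishes before the scrape times out.
# Assumptions: 10 s scrape_timeout (the Prometheus default), no network overhead.

def total_collect_seconds(per_call_seconds: float, calls: int) -> float:
    """Total time spent when each call blocks and the calls run one after another."""
    return per_call_seconds * calls


def fits_in_timeout(per_call_seconds: float, calls: int,
                    scrape_timeout_seconds: float = 10.0) -> bool:
    """True if the estimated collection time is strictly below the timeout."""
    return total_collect_seconds(per_call_seconds, calls) < scrape_timeout_seconds


if __name__ == "__main__":
    # The two delays tried in the question, with three check_stream calls each.
    for delay in (3, 5):
        total = total_collect_seconds(delay, 3)
        verdict = "within timeout" if fits_in_timeout(delay, 3) else "exceeds timeout"
        print(f"sleep({delay}) x 3 = {total} s: {verdict}")
```

With time.sleep(3) the estimate stays under the timeout, and with time.sleep(5) it does not, which matches what the question observes.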

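You should increase scrape_timeout in your Prometheus server config so that the server does not time out when scraping your target. Note that scrape_timeout cannot be greater than scrape_interval, so you may need to raise scrape_interval as well.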
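As a rough illustration, a scrape job for this exporter might look like the fragment below. The job name and the 30s/20s values are assumptions chosen to leave headroom over the 15-second collection time in the question. Port 8000 comes from the exporter code, and localhost assumes the exporter runs on the same machine as Prometheus. Prometheus rejects a configuration in which scrape_timeout is larger than scrape_interval, so both are set here.

```yaml
# prometheus.yml (fragment); job name and durations are illustrative assumptions
scrape_configs:
  - job_name: "custom_exporter"      # hypothetical job name
    scrape_interval: 30s             # how often Prometheus scrapes the target
    scrape_timeout: 20s              # must not exceed scrape_interval
    static_configs:
      - targets: ["localhost:8000"]  # port used by start_http_server(8000)
```

If the processing time varies, choose the timeout from the slowest run you expect to see, not the average.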