In my setup I have a Java component that reads data from the YARN manager and exposes the results of various jobs as metrics. For example, I have a job-duration metric that simply holds the duration of the last app run. It may look like this:
duration_time_millis{job="probe",app_name="import-results",app_type="MAPREDUCE",status="SUCCEEDED"}
1991392 @1542770979.823
1991392 @1542770994.823
1991392 @1542771009.823
...
265722 @1542781554.823
265722 @1542781569.823
265722 @1542781584.823
...
The thing is, I am scraping the server that exposes the metrics every 15s or so, but the jobs run irregularly, once every several hours. That means that over the past 6 hours I am getting 563x the first value and 520x the second value, even though there is only one change in the interval.
Is there a way to compute avg or stddev only on distinct values? Getting the number of distinct values would also mean better handling in histograms and heatmaps in Grafana, where count_values does not seem to be a good solution.
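To make the skew concrete, here is a minimal Python sketch of the arithmetic, outside of Prometheus. It only reuses the two durations and the sample counts quoted above; the helper name collapse_runs is made up for illustration and is not a Prometheus or Grafana function.

```python
from itertools import groupby
from statistics import mean, pstdev

# Samples as scraped over the window: 563 copies of the first
# duration followed by 520 copies of the second (counts from above).
samples = [1991392] * 563 + [265722] * 520


def collapse_runs(values):
    """Hypothetical helper: keep one value per run of consecutive identical samples."""
    return [value for value, _ in groupby(values)]


runs = collapse_runs(samples)

# Average over raw samples: weighted by how long each value was exposed.
print("avg over scraped samples:", mean(samples))
# Average and standard deviation over one value per change.
print("avg over distinct runs:  ", mean(runs))
print("stddev over distinct runs:", pstdev(runs))
```

The raw-sample average is pulled toward whichever value happened to be exposed longer, while the collapsed list treats each job run once. Note the assumption baked into this sketch: two consecutive job runs with exactly the same duration would be merged into one, so it counts changes rather than true runs.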
Thanks for any help on this!