Loss of data when using the VictoriaMetrics running_sum(increase(my_metric))


I'm trying to capture all the counter metrics from a Java Spring application over time and keep the data across restarts of the monitored application, that is, continue counting from where it left off.

I was using Prometheus for this type of monitoring, but its numbers were badly off. I read on forums that VictoriaMetrics would be more accurate, and it is. But there are still errors: it loses data whenever we restart the Spring application.

As in the image below, the total is correct, but the success (sucesso) and failure (falha) numbers are incorrect: success should be 2 and failure 6.

[image]

Accuracy was improved by following the forum tips: #80. However, it still has errors.

My question is whether running_sum(increase(my_metric[$__interval])) is really the right function for this kind of counting, and whether there is a way to reduce the error or even eliminate it.

I was hoping running_sum(increase(my_metric)) would give me a reliable number.

1 answer
Dmytro Kozlov

The increase function on counters will not go to zero even if the counter starts from zero.

[image]

Your query with the increase function will only change the output value.

The increase() function in VictoriaMetrics detects a counter reset only if the first sample after the reset is smaller than the last sample just before the reset. If the first sample after the reset is equal to or bigger than the last sample before it, VictoriaMetrics has no data that could help detect the reset, so the reset goes unnoticed. If you need to calculate the exact number of events over some interval, it may be better to store the number of events between scrapes and then use sum_over_time(m[d]) for these calculations.
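To make the reset rule concrete, here is a minimal Python sketch of it. This is a simplified model for illustration, not the actual VictoriaMetrics implementation: the function name and the sample values are hypothetical, and it only encodes the rule described above (a drop between two consecutive samples is treated as a reset).

```python
def increase_with_reset_detection(samples):
    """Total increase over consecutive counter samples.

    Simplified model (an assumption, not VictoriaMetrics code):
    a reset is detected only when a sample is smaller than the previous one.
    """
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        if cur < prev:
            # Reset detected: the counter restarted from zero,
            # so the whole current value counts as new events.
            total += cur
        else:
            # No visible drop: only the difference is counted.
            total += cur - prev
    return total


# Hypothetical scrapes. In both cases the app restarts after the third scrape.
detected = [0, 2, 5, 1, 3]    # first sample after restart (1) < last before (5)
undetected = [0, 2, 5, 6, 8]  # first sample after restart (6) >= last before (5)

detected_increase = increase_with_reset_detection(detected)      # 5 + 3 real events
undetected_increase = increase_with_reset_detection(undetected)  # 5 + 8 real events

# Alternative from the answer: store the number of events per scrape interval
# and sum them, which is what sum_over_time(m[d]) would do on such a series.
events_per_scrape = [2, 3, 6, 2]  # same events as the `undetected` case
exact_total = sum(events_per_scrape)

print(detected_increase, undetected_increase, exact_total)
```

In the second series the restart is invisible because the counter had already climbed past its old value by the next scrape, so the 5 events recorded before the restart are lost. Summing per-scrape event counts does not depend on detecting resets at all.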