I have RRD Files with multiple months of PDP Data (5min Interval).
For general-purpose Graphs it's fine when rrdtool automatically decides which RRA to use for displaying the Graph.
But some of my Graphs contain 95th-Percentile Data in the legend, which I need to be calculated from "exact" 5min-Interval Data, because calculating a Percentile from aggregated Data-Points can (by its nature) lead to dramatically incorrect values.
- I can `fetch` Data from the RRD File with a step of 300 and I'll get the right data to calculate the percentile on my own
- PROBLEM: When graphing with a step of 300, the displayed Percentile value varies depending on the `width` of the Graph, even if the Time-Range is the same and 300s Data is available for the whole Time-Range
- if the width for a 1-month graph is 800px, the shown Percentile (and also the max values, for example) is wrong
- if the width for a 1-month graph is 8000px, the Values are correct (matching the self-calculated values from the fetched data)
graph:
...
--step 300
...
"VDEF:perca=a,95,PERCENT",
...
created with:
'-s', '300',
...
"RRA:AVERAGE:0.5:1:53568", # 6 months pdp
"RRA:AVERAGE:0.5:12:8904", # 1 hour, 1 year.
"RRA:AVERAGE:0.5:288:730", # 1 day, 2 years.
"RRA:AVERAGE:0.5:2016:520", # 1 week, 10 years.
"RRA:MAX:0.5:1:600", # 5 min: 2 days
"RRA:MAX:0.5:12:8904", # 1 hour, 1 year.
"RRA:MAX:0.5:288:730", # 1 day, 2 years.
"RRA:MAX:0.5:2016:520", # 1 week, 10 years
This is due to data consolidation being performed prior to the `VDEF` calculation.

Although your `rrdtool graph` arguments specify a step of 300s, this is narrower than one pixel of the graph, so the data series are further averaged before you get to the `VDEF`. All the `CDEF` and `VDEF` functions always work with a time series of one CDP per pixel. From the RRDTool manual:

This means that, while you can decrease the resolution of the data, you cannot increase it. Sadly, to get an accurate 95th Percentile, you need higher-resolution data.
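To illustrate why averaging first distorts the percentile, here is a minimal Python sketch on synthetic data. The traffic pattern (one 5-minute burst per hour), the nearest-rank percentile method, and the helper names `percentile` and `consolidate_average` are assumptions made for the illustration; rrdtool's own `PERCENT` implementation may pick the rank slightly differently.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile (an assumption; not necessarily rrdtool's exact method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def consolidate_average(values, factor):
    """Average every `factor` consecutive samples, like an AVERAGE consolidation."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]

# Synthetic day of 5-minute samples: a baseline of 10 with one burst of 100 per hour.
raw = [100.0 if i % 12 == 0 else 10.0 for i in range(288)]

# Consolidate 12 samples (1 hour) into one averaged point.
hourly = consolidate_average(raw, 12)

raw_p95 = percentile(raw, 95)        # computed on the 300s data
hourly_p95 = percentile(hourly, 95)  # computed on the averaged data

print(raw_p95, hourly_p95)
```

In this made-up series the bursts make up more than 5% of the raw samples, so the raw 95th percentile lands on a burst value, while every hourly average is the same smoothed number and the bursts disappear from the result.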
So, if you omit the `--step 300` in a narrow graph, what will happen is:

With the `--step 300` it is a slightly different process, but the same result:

So, you can see the final outcome is the same; it's just a matter of where the 300s -> 1h consolidation happens, either in the RRA or at graph time.
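The amount of graph-time consolidation follows from how much time each pixel covers. As a rough sketch for the two widths mentioned in the question, assuming a 30-day month (the helper name `seconds_per_pixel` is made up for the example, and the exact rounding rrdtool applies to the resulting step is not modelled here):

```python
SECONDS_PER_DAY = 86400
PDP_STEP = 300  # step of the RRD in the question

def seconds_per_pixel(time_range_seconds, width_px):
    """Time span covered by one pixel column of the graph."""
    return time_range_seconds / width_px

month = 30 * SECONDS_PER_DAY  # assumption: a 30-day month

for width in (800, 8000):
    spp = seconds_per_pixel(month, width)
    # How many 300s PDPs fall into one pixel column.
    print(f"{width}px: {spp:.0f}s per pixel, about {spp / PDP_STEP:.1f} PDPs per pixel")
```

At 800px each pixel spans close to an hour of data, so many 300s points are merged into one; at 8000px each pixel covers roughly one 300s point.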
When using a wide graph, the time per pixel becomes smaller, and RRDTool then no longer needs to perform its additional consolidation of the data, resulting in a more accurate calculation:
When you retrieve the raw data using `rrdtool fetch`, this extra consolidation doesn't happen, so you get:

Your next question will likely be, how do I stop this from happening? The unfortunate answer is that you cannot. RRDTool does not have a Percentile-type CF, and so the correct calculations cannot be performed in the RRA (this would be the only real solution).
The Routers2 frontend for MRTG calculates 95th Percentiles for its graphs, and the way it does it is to perform a high-resolution `fetch` to get the raw data and calculate the value internally before passing it in an `HRULE` when making the graph. In other words, it doesn't use a `VDEF` at all, because of the problem you are experiencing.
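A minimal Python sketch of that fetch-then-`HRULE` approach might look like the following. It is not the Routers2 code; the helper names (`fetch_raw`, `parse_fetch`, `percentile`, `hrule_arg`), the colour, the legend text, and the nearest-rank percentile method are all assumptions for illustration. It relies only on the `rrdtool fetch` command line (`-r`, `-s`, `-e`) and its plain-text output of `timestamp: value ...` lines.

```python
import math
import subprocess

def fetch_raw(rrd_path, start, end, resolution=300, cf="AVERAGE"):
    """Run `rrdtool fetch` at the given resolution and return its text output."""
    cmd = ["rrdtool", "fetch", rrd_path, cf,
           "-r", str(resolution), "-s", str(start), "-e", str(end)]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def parse_fetch(output, column=0):
    """Extract one data-source column from fetch output, skipping unknown (NaN) rows."""
    values = []
    for line in output.splitlines():
        if ":" not in line:
            continue  # header line with DS names, or a blank line
        _, _, rest = line.partition(":")
        fields = rest.split()
        if column >= len(fields):
            continue
        value = float(fields[column])  # float() also accepts "nan" and "-nan"
        if not math.isnan(value):
            values.append(value)
    return values

def percentile(values, pct):
    """Nearest-rank percentile (an assumption; may differ slightly from VDEF PERCENT)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def hrule_arg(value, color="FF0000", legend="95th percentile"):
    """Build an HRULE argument for `rrdtool graph` (colour and legend are placeholders)."""
    return f"HRULE:{value}#{color}:{legend}"
```

Usage would be along the lines of `hrule_arg(percentile(parse_fetch(fetch_raw("my.rrd", start, end)), 95))`, with the resulting string added to the `rrdtool graph` arguments in place of the `VDEF`; `my.rrd`, `start`, and `end` are placeholders. To actually get 300s rows back, the start and end times should be aligned to multiples of the resolution.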