I am using rrdtool to collect and process some system metrics.
I have been experimenting with the rather marvelous HWPREDICT feature, which allows you to do Holt-Winters seasonal forecasting for aberration detection.
However, I've now hit a problem: when I started, I set my HWPREDICT parameters to have a "season" a day long:
rrdtool create <filename> -s 300 DS:curr_sessions:GAUGE:600:0:U RRA:AVERAGE:0.5:1:2880 RRA:AVERAGE:0.5:12:2016 RRA:AVERAGE:0.5:60:2400 RRA:HWPREDICT:1440:0.1:0.0001:288
288 samples with a 5 minute interval means a day long 'season'.
What I'd like to do is extend that, such that my 'seasons' are 2016 samples - a week.
E.g.
RRA:HWPREDICT:4032:0.1:0.0001:2016
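To make the arithmetic behind those numbers explicit, here is a small Python sketch. The helper name `rows_per_season` is made up for illustration; the 300 second step and the alpha/beta values are taken from the create command above, and the row count of 4032 is simply twice the season length, as in my example.

```python
STEP_SECONDS = 300  # the "-s 300" from the create command above


def rows_per_season(season_seconds, step_seconds=STEP_SECONDS):
    """Return how many primary data points make up one season."""
    rows, remainder = divmod(season_seconds, step_seconds)
    if remainder:
        raise ValueError("season is not a whole number of steps")
    return rows


DAY = 24 * 60 * 60

day_rows = rows_per_season(DAY)        # one day of 5 minute samples
week_rows = rows_per_season(7 * DAY)   # one week of 5 minute samples

# Rebuild the RRA definition from the example, with rows = 2 seasons.
hw_rra = f"RRA:HWPREDICT:{2 * week_rows}:0.1:0.0001:{week_rows}"
print(day_rows, week_rows, hw_rra)
```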
But I'm having difficulty figuring out whether there's any way I can do this without resetting my source data. I have looked at:
rrddump, which lets you dump/restore. This exports XML but preserves the RRA structure.
rrdtune, which lets you adjust some of the parameters for alpha/beta/gamma, but not the season length.
rrdresize, which lets you amend the length of an RRA, but not the length of the season.
Does anyone have a good solution that lets me recreate my rrd but preserve the data held in it? (I don't mind 'losing' my HWPREDICT RRA data as changing the seasonal period pretty much invalidates it anyway, but would quite like to keep my existing data in the other RRAs).
For bonus points - I'm most familiar with perl, so don't mind having a fiddle with perl/XML if anyone's able to give me a point in the right direction. (Can I 'fudge' the exported XML somehow, to just 'feed' it a new RRA with a bunch of uninitialised values?)
Partial answer so far:
Create a new rrd file with the new parameters. Include --start to allow you to backdate far enough. (Find the lowest 'timestamp' in the dumped XML.)
Process the dumped XML and extract the data. There's a gotcha here - rrdtool won't accept 'clashing' timestamps if you have overlapping RRAs, for example if you have a MAX and an AVERAGE.
I've chosen to pick out MAX as that's probably most useful to me. I have also done it 'in order' because I specified my initial RRAs in the appropriate order (e.g. smallest resolution first). This will probably work otherwise, but bear in mind you might get some different values looking at a coarser resolution 'MAX'.
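As a rough illustration of the replay idea, here is a Python sketch (not the Perl I actually used). It assumes a single data source and that each row in the dump is preceded by a comment ending in "/ <epoch seconds>"; the sample text and the names `ROW` and `replay_points` are made up for the example. Where timestamps clash, the first RRA seen wins, and NaN rows are skipped.

```python
import re

# Made-up excerpt in the assumed shape of `rrdtool dump` output.
SAMPLE_DUMP = """
<rra><cf>MAX</cf><pdp_per_row>1</pdp_per_row><database>
<!-- 2013-06-04 12:00:00 UTC / 1370347200 --> <row><v>1.2000000000e+01</v></row>
<!-- 2013-06-04 12:05:00 UTC / 1370347500 --> <row><v>NaN</v></row>
<!-- 2013-06-04 12:10:00 UTC / 1370347800 --> <row><v>1.5000000000e+01</v></row>
</database></rra>
<rra><cf>MAX</cf><pdp_per_row>12</pdp_per_row><database>
<!-- 2013-06-04 11:00:00 UTC / 1370343600 --> <row><v>2.0000000000e+01</v></row>
<!-- 2013-06-04 12:00:00 UTC / 1370347200 --> <row><v>3.0000000000e+01</v></row>
</database></rra>
"""

# Capture the epoch from the comment and the single value from the row.
ROW = re.compile(r"<!--.*?/\s*(\d+)\s*-->\s*<row><v>\s*([^<\s]+)\s*</v>")


def replay_points(dump_text):
    """Return sorted (timestamp, value) pairs with no clashing timestamps."""
    seen = {}
    for ts, value in ROW.findall(dump_text):
        if value.lower() == "nan":
            continue  # nothing to replay for unknown rows
        # setdefault keeps the value from the first RRA that had this time.
        seen.setdefault(int(ts), float(value))
    return sorted(seen.items())


points = replay_points(SAMPLE_DUMP)
start = points[0][0] - 1  # candidate --start: just before the oldest sample
updates = [f"{ts}:{val:g}" for ts, val in points]
print(start, updates)
```

Each entry in `updates` is in the `timestamp:value` form that `rrdtool update` takes, and they have to be fed in ascending time order.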
Now, so far - all this does is 'replay' all the timestamps you have available. Unfortunately - it simply doesn't work (based on RRA config) for any consolidated data points, because you don't have enough samples at that resolution (e.g. 1 per day).
So as a second approach (edited):
Process both dumps using XML::Twig. From the first, you preserve the 'non HWPREDICT' data and delete the HWPREDICT stuff. The second you process and splice its HWPREDICT RRAs into the first, to create a new XML dump... which you restore.
Note - this may well not work if you've wildly differing data. (And note - the HWPREDICT stuff being re-initialised might take a while to 'settle down' again.)
What this is doing is basically running through two separate 'rrdtool dumps' and splicing together the RRAs. I'll give this a try and see if it actually does what I need it to - I won't know until the Holt-Winters functions settle down.
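For anyone who wants to try the same splice without Perl, here is a minimal Python sketch using the standard library's xml.etree.ElementTree in place of XML::Twig. It is untested against real dumps: the two XML strings are cut-down, made-up stand-ins, the names `HW_CFS`, `cf_of` and `splice` are hypothetical, and it assumes the second dump comes from a file created with the new parameters and the same RRAs in the same order, so that any references between the Holt-Winters RRAs still line up.

```python
import xml.etree.ElementTree as ET

# Consolidation functions treated as "HWPREDICT stuff" in this sketch.
HW_CFS = {"HWPREDICT", "MHWPREDICT", "SEASONAL",
          "DEVSEASONAL", "DEVPREDICT", "FAILURES"}

# Made-up, heavily cut-down stand-ins for two `rrdtool dump` outputs.
OLD_DUMP = """<rrd><step>300</step>
<rra><cf>AVERAGE</cf><database><row><v>4.2</v></row></database></rra>
<rra><cf>HWPREDICT</cf><database><row><v>9.9</v></row></database></rra>
</rrd>"""
NEW_DUMP = """<rrd><step>300</step>
<rra><cf>AVERAGE</cf><database><row><v>NaN</v></row></database></rra>
<rra><cf>HWPREDICT</cf><database>
<row><v>NaN</v></row><row><v>NaN</v></row></database></rra>
</rrd>"""


def cf_of(rra):
    """Return the consolidation function name of an <rra> element."""
    return (rra.findtext("cf") or "").strip()


def splice(old_xml, new_xml):
    """Keep the old data RRAs, take the Holt-Winters RRAs from the new dump."""
    old_root = ET.fromstring(old_xml)
    new_root = ET.fromstring(new_xml)
    old_rras = old_root.findall("rra")
    new_rras = new_root.findall("rra")
    if [cf_of(r) for r in old_rras] != [cf_of(r) for r in new_rras]:
        raise ValueError("RRA layouts differ; this sketch cannot pair them up")
    for old_rra, new_rra in zip(old_rras, new_rras):
        if cf_of(old_rra) in HW_CFS:
            # Swap in place so the RRA order in the file is unchanged.
            position = list(old_root).index(old_rra)
            old_root.remove(old_rra)
            old_root.insert(position, new_rra)
    return old_root


merged = splice(OLD_DUMP, NEW_DUMP)
merged_xml = ET.tostring(merged, encoding="unicode")
print(merged_xml)
```

The resulting XML would then be written to a file and handed to rrdtool restore.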