rrdtool - amend a HWPREDICT rra


I am using rrdtool to collect and process some system metrics.

I have been experimenting with the rather marvelous HWPREDICT feature, which allows you to do Holt-Winters seasonal forecasting for aberration detection.

However, I've now hit a problem: when I started, I set my HWPREDICT parameters to have a "season" a day long:

rrdtool create <filename> -s 300 DS:curr_sessions:GAUGE:600:0:U RRA:AVERAGE:0.5:1:2880 RRA:AVERAGE:0.5:12:2016 RRA:AVERAGE:0.5:60:2400 RRA:HWPREDICT:1440:0.1:0.0001:288

288 samples with a 5 minute interval means a day long 'season'.

What I'd like to do is extend that, such that my 'seasons' are 2016 samples - a week.

E.g.

RRA:HWPREDICT:4032:0.1:0.0001:2016
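The season lengths above follow directly from the step size. A quick sanity check of the arithmetic, as a minimal sketch (`samples_per_season` is a hypothetical helper for illustration, not part of rrdtool):

```perl
use strict;
use warnings;

# Hypothetical helper: number of primary data points in one season.
sub samples_per_season {
    my ( $season_seconds, $step_seconds ) = @_;
    die "season must be a whole multiple of the step\n"
        if $season_seconds % $step_seconds;
    return $season_seconds / $step_seconds;
}

my $step = 300;    # -s 300 from the create command
my $day  = samples_per_season( 86_400,     $step );
my $week = samples_per_season( 7 * 86_400, $step );
print "day=$day week=$week\n";    # prints "day=288 week=2016"
```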

But I'm having difficulty figuring out if there's any way I can do this without resetting my source data. I have looked at rrddump which lets you dump/restore. This exports XML but preserves the RRA structure.

rrdtune lets you adjust some of the parameters for alpha/beta/gamma, but not the season length.

And rrdresize lets you amend the length of an RRA, but not the length of the season.

Does anyone have a good solution that lets me recreate my rrd but preserve the data held in it? (I don't mind 'losing' my HWPREDICT RRA data as changing the seasonal period pretty much invalidates it anyway, but would quite like to keep my existing data in the other RRAs).

For bonus points - I'm most familiar with perl, so don't mind having a fiddle with perl/XML if anyone's able to give me a point in the right direction. (Can I 'fudge' the exported XML somehow, to just 'feed' it a new RRA with a bunch of uninitialised values?)

Answer by Sobrique (best answer):

Partial answer so far:

Create a new rrd file with the new parameters. Include --start so that you can backdate it far enough. (Find the lowest 'timestamp' in the dumped XML.)
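As an alternative to scanning the XML for the lowest timestamp, you can work out how far back each RRA reaches. A minimal sketch, assuming each RRA's newest row is aligned to a multiple of step * pdp_per_row at or before lastupdate (which is how the timestamps in the dump comments appear to be laid out). `earliest_timestamp` and the example lastupdate value are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: oldest row timestamp across a set of RRAs.
# $rras is a list of [ pdp_per_row, rows ] pairs.
sub earliest_timestamp {
    my ( $lastupdate, $step, $rras ) = @_;
    my $earliest;
    for my $rra (@$rras) {
        my ( $pdp_per_row, $rows ) = @$rra;
        my $span     = $step * $pdp_per_row;    # seconds covered by one row
        my $last_row = $lastupdate - $lastupdate % $span;
        my $first    = $last_row - ( $rows - 1 ) * $span;
        $earliest = $first if !defined $earliest || $first < $earliest;
    }
    return $earliest;
}

my $lastupdate = 1_426_000_000;    # example value only
my $start = earliest_timestamp( $lastupdate, 300,
    [ [ 1, 2880 ], [ 12, 2016 ], [ 60, 2400 ] ] );
print "use a --start earlier than $start\n";
```

With the three AVERAGE RRAs from the question, the coarsest one (60 PDPs per row, 2400 rows) reaches back about 500 days, so that is the one that decides the --start value.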

Process the dumped XML and extract the data. There's a gotcha here - rrdtool update won't accept a timestamp that isn't later than the previous update, so you get 'clashing' timestamps if you have overlapping RRAs. For example, if you've a MAX and an AVERAGE.

I've chosen to pick out MAX as that's probably most useful to me.

I have also done it 'in order' because I have specified my initial RRAs in the appropriate order (e.g. smallest resolution first). This'll probably work otherwise, but bear in mind you might get some different values looking at a coarser resolution 'MAX'.

#!/usr/local/bin/perl

use strict;
use warnings;

use XML::Twig;

my %update_vals;

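sub parse_rra {
    my ( $twig, $rra ) = @_;

    # Only replay one consolidation function; overlapping RRAs (e.g. MAX
    # and AVERAGE) would otherwise produce clashing timestamps.
    return unless $rra->first_child_text('cf') eq 'MAX';
    my $step     = $twig->root->first_child_text('step');
    my $lastupd  = $twig->root->first_child_text('lastupdate');
    my $pdp_step = $rra->first_child_text('pdp_per_row');

    # Each row covers $step * $pdp_step seconds. Assumption: the newest row
    # is aligned to a multiple of that span, as the timestamps in the dump
    # comments suggest.
    my $row_span  = $step * $pdp_step;
    my $base_time = $lastupd - $lastupd % $row_span;

    my @row_vals;
    foreach my $row ( $rra->first_child('database')->children('row') ) {
        my $val = $row->first_child_text('v');
        $val =~ s/^\s+|\s+$//g;    # tolerate padding around the value
        push( @row_vals, $val );
    }

    my $timestamp = $base_time - $#row_vals * $row_span;
    foreach my $value (@row_vals) {

        # Keep real values; store NaN only if nothing is recorded yet.
        if ( $value ne 'NaN' or not defined $update_vals{$timestamp} ) {
            $update_vals{$timestamp} = $value;
        }

        # Advance for every row, whether or not its value was stored.
        $timestamp += $row_span;
    }
}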

my $target_file      = "your_rrd_dump.xml";
my $destination_file = "new.rrd";

my $twig = XML::Twig->new( 'twig_handlers' => { 'rra' => \&parse_rra } )
    ->parsefile($target_file);

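# Replay in chronological order (numeric sort, not string sort).
foreach my $timestamp ( sort { $a <=> $b } keys %update_vals ) {

    # rrdtool update expects 'U' for an unknown value.
    my $value = $update_vals{$timestamp};
    $value = 'U' if $value eq 'NaN';
    system( 'rrdtool', 'update', $destination_file, "$timestamp:$value" ) == 0
        or warn "rrdtool update failed for $timestamp\n";
}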

So far, all this does is 'replay' every timestamp you have available. Unfortunately, it simply doesn't work (given the RRA config) for any of the consolidated data points: replayed as plain updates, the rows from a coarser RRA are spaced too far apart (e.g. 1 per day), so you don't have enough samples at that resolution.
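One likely reason is the heartbeat (600 seconds in the create command from the question): when more than a heartbeat passes between two updates, rrdtool treats the data source as unknown for that interval. A minimal sketch that flags which replayed timestamps would fall foul of that (`gaps_over_heartbeat` and the sample timestamps are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: timestamps that arrive more than $heartbeat seconds
# after the previous one.
sub gaps_over_heartbeat {
    my ( $heartbeat, @timestamps ) = @_;
    my @sorted = sort { $a <=> $b } @timestamps;
    my @late;
    for my $i ( 1 .. $#sorted ) {
        push @late, $sorted[$i]
            if $sorted[$i] - $sorted[ $i - 1 ] > $heartbeat;
    }
    return @late;
}

# Hourly rows (an RRA with 12 PDPs per row) followed by 5-minute rows.
my @replayed = ( 0, 3600, 7200, 7500, 7800 );
my @late     = gaps_over_heartbeat( 600, @replayed );
print "@late\n";    # prints "3600 7200"
```

Only the 5-minute rows arrive inside the heartbeat; everything replayed from the hourly RRA would be flagged.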

So as a second approach (edited):

  • Dump your 'source' RRD to XML.
  • Create a 'new' RRD with the desired parameters, and dump that to XML as well.

Process both dumps using XML::Twig. From the first, you preserve the 'non HWPREDICT' data and delete the HWPREDICT stuff. From the second, you take the HWPREDICT RRAs and splice them into the first, to create a new XML dump... which you restore.

Note - this may well not work if you've wildly differing data. (And note - the HWPREDICT stuff being re-initialised might take a while to 'settle down' again.)
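The whole sequence might look something like the following. This is only a dry-run sketch: it prints the commands rather than running them, and the names old.rrd, template.rrd and splice.pl are hypothetical. It assumes the template is created with the same DS and the same non-Holt-Winters RRAs, in the same order, as the original, so that any RRA indices recorded inside the Holt-Winters RRAs still point at the right places after splicing.

```perl
use strict;
use warnings;

# Build the command lines only; nothing is executed here.
sub plan_commands {
    my ( $old_rrd, $template ) = @_;
    return (
        "rrdtool dump $old_rrd > your_xml_dump.xml",
        "rrdtool create $template -s 300 DS:curr_sessions:GAUGE:600:0:U "
            . "RRA:AVERAGE:0.5:1:2880 RRA:AVERAGE:0.5:12:2016 "
            . "RRA:AVERAGE:0.5:60:2400 RRA:HWPREDICT:4032:0.1:0.0001:2016",
        "rrdtool dump $template > new_params.xml",
        "perl splice.pl",    # hypothetical name for the splicing script
        "rrdtool restore new.xml new.rrd",
    );
}

print "$_\n" for plan_commands( 'old.rrd', 'template.rrd' );
```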

#!/usr/local/bin/perl

use strict;
use warnings;

use XML::Twig;

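# Every consolidation function that belongs to the Holt-Winters machinery.
my @cf_to_replace =
    qw ( HWPREDICT MHWPREDICT SEASONAL DEVSEASONAL DEVPREDICT FAILURES );
my %hwp_cf = map { $_ => 1 } @cf_to_replace;

# The original dump; Holt-Winters RRAs from the new dump get pasted in here.
my $xml_doc;

# Handler for the original dump: drop the old Holt-Winters RRAs and keep
# everything else.
sub discard_hwp {
    my ( $twig, $rra ) = @_;
    my $CF = $rra->first_child_text('cf');
    if ( $hwp_cf{$CF} ) { $rra->delete }
}

# Handler for the dump of the newly created RRD: move each Holt-Winters RRA
# to the end of the original document, preserving their order.
sub splice_hwp {
    my ( $twig, $rra ) = @_;
    my $CF = $rra->first_child_text('cf');
    if ( $hwp_cf{$CF} ) {
        $rra->cut;
        $rra->paste( 'last_child', $xml_doc->root );
    }
}

my $orig_file             = 'your_xml_dump.xml';
my $newly_created_rrddump = 'new_params.xml';

# Pass 1: load the original dump, minus its Holt-Winters RRAs.
$xml_doc = XML::Twig->new(
    'pretty_print'  => 'indented',
    'twig_handlers' => { 'rra' => \&discard_hwp }
)->parsefile($orig_file);

# Pass 2: parse the new dump; the handler splices into $xml_doc as it goes.
my $new = XML::Twig->new( 'twig_handlers' => { 'rra' => \&splice_hwp } )
    ->parsefile($newly_created_rrddump);

# Write the combined dump, ready for 'rrdtool restore'.
open( my $output, ">", "new.xml" ) or die $!;
print {$output} $xml_doc->sprint;
close($output);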

What this is doing is basically running through two separate 'rrdtool dumps' and splicing together the RRAs. I'll give this a try and see if it actually does what I need it to - I won't know until the Holt-Winters functions settle down.
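Before restoring, it may be worth a quick look at which consolidation functions ended up in the spliced file, and in what order. A minimal sketch using a plain regex rather than a full XML parse (`cf_list` is a hypothetical helper, and the sample text is a tiny stand-in, not real dump output):

```perl
use strict;
use warnings;

# Hypothetical helper: consolidation functions in document order.
sub cf_list {
    my ($xml) = @_;
    my @cfs = $xml =~ m{<cf>\s*([A-Z]+)\s*</cf>}g;
    return @cfs;
}

# Stand-in for the contents of new.xml.
my $sample = <<'XML';
<rrd>
  <rra><cf>AVERAGE</cf></rra>
  <rra><cf>HWPREDICT</cf></rra>
  <rra><cf>SEASONAL</cf></rra>
</rrd>
XML

print join( ' ', cf_list($sample) ), "\n";    # prints "AVERAGE HWPREDICT SEASONAL"
```

For the real file you would read new.xml into a string first and expect the AVERAGE RRAs followed by the Holt-Winters ones.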