RRDtool: graphing a digital event without jagged edges


I'm reading my energy meter output to keep track of actual energy used, etc. The energy cost is calculated with a tariff that changes during the day from one state to another (1 or 2). When I graph the tariff together with the actual usage over time, the sharp edge of the tariff state gets jagged, possibly because of the averaging.

I have used DS:tarif:GAUGE:... to set up the database, and I'm using DEF:tarif=xx.rrd:tarif:AVERAGE to graph.

How do I record and graph a "digital" signal with sharp edges?


There are 2 answers

Steve Shipway

There are two things that you might be referring to here.

First, make sure you are not using --slope-mode. This option uses slopes rather than steps in the graph; this sounds like the wrong option for you.

Next, you have the problem that your tariff DS is varying between two values (let's say a and b), but when you look at the higher-level graphs, this starts to average out and looks wrong.

When looking at your high-granularity (i.e., close-up) graphs, you have one (or more) pixels per RRA data point, and one sample per RRA data point. So, you'll see either a or b. However, when you move to a lower-granularity (i.e., zoomed-out) graph, RRDtool will start to have multiple data points per pixel. Depending on your RRD file definition, RRDtool will then either consolidate on the fly, or move to a different RRA with more samples per data point.

So, this means that you have multiple samples per pixel, and they need to somehow be combined. By default, RRDTool will average them, which can result in the jagged behaviour.

However, what do you want to happen? If the time interval corresponding to a single pixel has two instances of a and three of b, what should the graph show?

Here are a couple of suggestions on how you might do it.

  1. Use background.

Since your tariff has only two values, you could use this to colour the background, e.g. make a red or green background depending on tariff, and then draw your usage graph line over the top of this. A background colour can be drawn as an area from 0 to INF using a semi-transparent colour like #ff808080.
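
The background idea could look roughly like the sketch below. It is only an illustration under assumptions: the tariff is stored as 100 (low) or 200 (normal), as described in the follow-up post in this thread, and the threshold of 150, the colours, the legend texts and the helper name graph_with_tariff_background are all made up for the example.

```shell
# Hypothetical helper: shade the background while the high tariff is active,
# then draw the usage line on top of it.
# Assumption: the tariff DS holds 100 (low) or 200 (normal), hence the 150 threshold.
graph_with_tariff_background() {
  rrd="$1"   # path to the RRD file
  out="$2"   # path of the PNG to write
  rrdtool graph "$out" --start -1d \
    DEF:energy="$rrd":energy:AVERAGE \
    DEF:tarif="$rrd":tarif:AVERAGE \
    CDEF:hi_bg=tarif,150,GT,INF,UNKN,IF \
    AREA:hi_bg#ff808080:"High tariff" \
    LINE1:energy#00aa00:"Energy"
}
```

The AREA comes before the LINE1 so that the usage line is drawn over the shading. Where the CDEF yields UNKN, nothing is drawn, so the low-tariff periods keep the normal canvas colour.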

  2. Use a special MAX (or MIN) RRA

Maybe you just want to display the maximum tariff for that period. So, you can create an additional MAX type RRA for each consolidation interval, and graph the MAX. Of course, this means that when you have 1 pixel = 1 day, you'll be seeing just the higher value as a straight line. I suppose a 'median' consolidation would be useful here, but for obvious reasons RRDtool doesn't have this.
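
As a minimal sketch of the MAX approach, assuming a fresh example database with a 300-second step: the helper names, the row counts and the choice of 12 steps per row (one hour) are illustrative assumptions, not taken from anyone's real setup.

```shell
# Hypothetical example database: MAX RRAs alongside the AVERAGE ones,
# one pair per consolidation interval (5 minutes and 1 hour here).
create_rrd_with_max() {
  rrdtool create "$1" --step 300 \
    DS:tarif:GAUGE:600:U:U \
    RRA:AVERAGE:0.5:1:480 \
    RRA:AVERAGE:0.5:12:168 \
    RRA:MAX:0.5:1:480 \
    RRA:MAX:0.5:12:168
}

# Graph the tariff from the MAX RRAs instead of the AVERAGE ones.
graph_max_tariff() {
  rrdtool graph "$2" --start -1w \
    DEF:tarif="$1":tarif:MAX \
    LINE1:tarif#000000:"Tariff (max)"
}
```

With MAX, an hourly row shows the high value as soon as any 5-minute sample in that hour was high, so the line stays on one of the two levels instead of sitting in between.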

  3. A bigger graph

A large graph means more pixels, which means less averaging required.

  4. Display your data differently

What are you trying to visualise? Possibly you don't need to see the average cost per unit over this time window, and a calculated sum of units x tariff would work better - particularly since this would be able to be averaged up without problem.

  5. Don't use RRDtool

RRDTool is designed to progressively normalise, summarise and expire regular time-series data over time, and does so very efficiently. However, if you're interested in having the exact data forever then maybe you need a different database.

  6. Pre-normalise your sampling times

If your data change frequently, then Data Normalisation might be changing them. Make sure you always store the data on a time step boundary - if your RRD step is 300 (5min) then ensure you store the data at timestamps which are a multiple of 300, and don't use N (meaning 'whatever the time is Now').
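
Rounding the timestamp down to a step boundary can be sketched as follows. The helper name align_ts is made up, and the update line in the comment assumes a database with a single DS.

```shell
# Round a Unix timestamp down to the previous multiple of the step.
align_ts() {
  ts="$1"     # Unix timestamp in seconds
  step="$2"   # RRD step in seconds
  echo $(( ts - ts % step ))
}

# Hypothetical usage (not executed here), storing a value on the boundary
# instead of at "N":
#   rrdtool update some.rrd "$(align_ts "$(date +%s)" 300):$value"
```

Keep in mind that rrdtool update only accepts timestamps later than the last stored update, so this works best when the collector runs once per step.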

paulv

Thank you for this very elaborate answer. Because I have not been able to find suitable answers to my problem, your summary, and my situation, may hopefully also help others.

Looking back, I should have included more information, and before addressing your points, let me do that right now:

Here is a snippet of my database definition script:

# 5 min sample rate = 300 step rate
# 1D report: 1 step every 5 minutes (5 min sample rate=1), 20 per hr, 24hrs = 480 slots
# 1Wk report: 1 step every hour (20 x 5 min samples=20), 24 hrs, 7 days = 168 slots
# 1M report: 1 step every hour (20 x 5 min samples=20), 24 hrs, 30 days = 720 slots
# 1Y report: 1 step every day (24 Hr * 20 samples=480), 365 days = 365 slots
rrdtool create energy_mon.rrd --step 300 --start 1480943366 \
DS:meter_total:COUNTER:600:U:U \
DS:meter_low:COUNTER:600:U:U \
DS:meter_hi:COUNTER:600:U:U \
DS:energy:GAUGE:600:U:U \
DS:tarif:GAUGE:600:U:U \
RRA:AVERAGE:0.5:1:480 \
RRA:AVERAGE:0.5:20:168 \
RRA:AVERAGE:0.5:20:720 \
RRA:AVERAGE:0.5:480:365
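
A quick check of the arithmetic behind that definition, as a sketch with made-up helper names: with a 300-second step there are 12 samples per hour rather than 20, so the row widths and time spans work out differently from what the comments assume.

```shell
# Seconds represented by one RRA row: step * steps_per_row.
rra_resolution() { echo $(( $1 * $2 )); }

# Total time span an RRA can hold: step * steps_per_row * rows.
rra_coverage() { echo $(( $1 * $2 * $3 )); }

rra_resolution 300 20      # 6000 s, i.e. 100 minutes per row, not 1 hour
rra_coverage 300 1 480     # 144000 s, i.e. 40 hours
rra_coverage 300 20 168    # 1008000 s, about 11.7 days
```

For one-hour rows at this step the RRA would need 12 steps per row, and for one-day rows 288.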

Here is a snippet of my graphing script:

#daily
rrdtool graph $GDIR/energy_daily.png --start -1d \
-w 675 -h 250 \
--vertical-label "KWatt" \
--lower-limit=0 \
--watermark "`date`" \
DEF:energy=$DIR/energy_mon.rrd:energy:AVERAGE \
LINE1:energy$GREEN_COLOR:"Energie" \
DEF:tarif=$DIR/energy_mon.rrd:tarif:AVERAGE \
LINE1:tarif$BLACK_COLOR:"Tarief"

#two days
rrdtool graph $GDIR/energy_2days.png --start -2d \
-w 675 -h 250 \
--vertical-label "KWatt" \
--lower-limit=0 \
--watermark "`date`" \
DEF:energy=$DIR/energy_mon.rrd:energy:AVERAGE \
LINE1:energy$GREEN_COLOR:"Energie" \
DEF:tarif=$DIR/energy_mon.rrd:tarif:AVERAGE \
LINE1:tarif$BLACK_COLOR:"Tarief"

Here is a graph of the "1 day" result: [image: 1 day]

Here is a graph of the "last 48 hrs": [image: 2 day]

In order to see my tariff on the graphs together with the actual energy usage in kW, I multiply the 1 (low) or 2 (normal) setting by 100 and store that number in the database. On weekdays, the tariff changes at 23:00 hrs to "low" and back to normal at 07:00 hrs. On the "daily" graph this is reported correctly, albeit with a 5 min. sample resolution "error".

As you can see, the "last 48 hrs" graph, set by "-2d" in the graphing instructions, should be using the same RRA data from the database, but the tariff line does not switch from 100 to 200. Instead it "sits" at about halfway for approx. 1-1/2 hours, or more than 30 samples. This is not a rounding based on a pixel level resolution (;-).

The only explanation I have at this moment is that the "-2d" graph uses the second RRA, which is actually intended for a week's worth of data. If this is the answer, then how does one influence the relationship between the "-d/w/m/y" settings in the graphing instructions and the intended RRAs? I don't see a connection between the two, so it's probably selected automatically, and most likely (if it's that smart) based on the number of data points. The first RRA does not have enough, so it switches to the second RRA. Seems plausible, but is this true?

BTW, the other graphs I use, with "-1w", "-1m" and "-1y", show the same weird results on the tariff "signal".
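
That guess can be checked against a rough model. The sketch below is not RRDtool's actual selection code: it simply assumes the finest AVERAGE RRA whose rows span the whole requested window is used, with the RRA list copied from the create script in this post. The function name pick_rra is made up.

```shell
# Simplified model (an assumption, not RRDtool's real algorithm): print the
# seconds-per-row of the first RRA whose total span covers the window.
pick_rra() {
  window="$1"   # requested graph window in seconds
  step=300
  # steps_per_row:rows pairs, taken from the create script in this post
  for rra in 1:480 20:168 20:720 480:365; do
    steps=${rra%%:*}
    rows=${rra##*:}
    if [ $(( step * steps * rows )) -ge "$window" ]; then
      echo $(( step * steps ))
      return 0
    fi
  done
  return 1
}

pick_rra 86400     # window for --start -1d
pick_rra 172800    # window for --start -2d
```

Under this model the one-day window fits in the first RRA (which spans 40 hours), while the two-day window falls through to the second RRA, whose rows are 100 minutes wide. That would be consistent with the tariff sitting halfway for roughly an hour and a half. Running rrdtool fetch on the file with AVERAGE and --start -2d should show the actual spacing of the returned rows.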

In the meantime, I have been able to "fix" the graphs without redefining my database by adding a CDEF. Here is what I use for all graphs other than the "1d".

DEF:tarif=$DIR/energy_mon.rrd:tarif:AVERAGE \
CDEF:normaal=tarif,100,GT,200,100,IF \
LINE1:normaal$BLACK_COLOR:"Tarief"

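To explain the CDEF: if the stored tariff value is greater than 100 (that is, anything above the "low" level, including averaged in-between values), the result "normaal" is set to 200, the normal (high) tariff. If not, it is set to 100, the low ("dal") tariff.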
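
The effect of that expression on a single value can be emulated outside RRDtool. This is only a sketch to illustrate the logic; snap_tarif is a made-up name, and awk stands in for the RPN evaluation.

```shell
# Emulate the RPN expression tarif,100,GT,200,100,IF for one value.
snap_tarif() {
  awk -v v="$1" 'BEGIN { print ((v > 100) ? 200 : 100) }'
}

snap_tarif 100      # pure low tariff stays at 100
snap_tarif 133.3    # an averaged in-between value is snapped to 200
snap_tarif 200      # pure normal tariff stays at 200
```

With the threshold at 100, any consolidated row that contains some normal-tariff time is shown as 200. A threshold of 150 would instead show whichever state covered more of the row.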

Your advice to use the background color for tariff is a very good one, and because I can use that on the current database, I have tried that too.

Your MIN/MAX solution is a good one too. I thought about that earlier but put it off, because it requires setting up a new database and filling it with previous data (which I have in .csv form; I have gone through that tedious exercise before, which is why I have the --start in the rrdtool create).

But, I would like to get to the bottom of this issue. Do you or anybody else have any more words of wisdom to share or can verify/explain the working of the RRA <==> DEF relationship or selection?