Kapacitor is taking a wrong time format


I'm using InfluxData stack for anomaly detection in time series data, using InfluxDB and Kapacitor.

I loaded some open-source sample data and wrote the following TICKscript to detect anomalies:

batch
    .query('select mean(value) from "nycTaxi"."default"."nycTaxi"')
        .period(1h)
        .every(2h)
        .groupBy(time(1h))
    .mapReduce(influxql.percentile('mean', 90.0))
    .eval(lambda: sigma("percentile"))
        .as('sigma')
        .keep('percentile', 'sigma')
    .alert()
        .warn(lambda: "sigma" > 2.0)
        .log('/path/alerts.log')
        .crit(lambda: "sigma" > 3.0)
        .log('/path/alerts.log')

This produces alerts like the following:

{
  "id": "nycTaxi:nil",
  "message": "nycTaxi:nil is WARNING",
  "time": "2016-09-13T14:43:21.892057062Z",
  "level": "WARNING",
  "data": {
    "series": [
      {
        "name": "nycTaxi",
        "columns": ["time", "percentile", "sigma"],
        "values": [
          ["2016-09-13T14:43:21.892057062Z", 1279, 2.002345963142575]
        ]
      }
    ]
  }
}

To record the data I used this command:

    kapacitor record batch -start 2014-07-01T00:00:00Z -stop 2015-02-31T00:00:00Z -name nyc

For some reason Kapacitor stamps the alert with a 2016 date, even though the most recent point in the database is dated 2015-01-31. Why does this happen?


There are 2 answers

0
Martin Aparicio Pons On BEST ANSWER

I opened an issue in the Kapacitor repo, and the solution to my problem was to replay the data with the following command:

    kapacitor replay -id RECORDING_ID -name nyc -fast -rec-time

The key here is the -rec-time flag, which makes the replay use the timestamps saved in the recording instead of the present time.

Kudos to Nathanielc, who solved the issue.

3
tkit On

InfluxDB feeds Kapacitor with data in more or less real time. Kapacitor was meant as a live analysis and alerting tool; it is not really intended to go backwards through all your historical data.

Your current query only looks at the most recent data (the last hour), which is why you are seeing 2016 in there. That is by design. If you want to check for anomalies in your historical data, you will have to write a small program (for example, using an InfluxDB client library for the language of your choice) that goes through all your old data hour by hour, fetches each window, and analyzes it. You could perhaps also use backfills for this.
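As a rough illustration of the hour-by-hour approach, here is a minimal Python sketch. It only generates the time windows and the matching InfluxQL query strings; actually sending each query is left to whichever InfluxDB client library you use. The function names hourly_windows and build_query are made up for this example, the measurement and field names are taken from the question, and the date range in the demo is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def hourly_windows(
    start: datetime, stop: datetime, step: timedelta = timedelta(hours=1)
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering start..stop."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = start
    while current < stop:
        # Clamp the last window so it never runs past `stop`.
        window_end = min(current + step, stop)
        yield current, window_end
        current = window_end


def build_query(
    window_start: datetime, window_end: datetime, measurement: str = "nycTaxi"
) -> str:
    """Build an InfluxQL query for the mean of "value" inside one window."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC3339 timestamps; inputs are assumed to be UTC
    return (
        f'SELECT mean("value") FROM "{measurement}" '
        f"WHERE time >= '{window_start.strftime(fmt)}' "
        f"AND time < '{window_end.strftime(fmt)}'"
    )


if __name__ == "__main__":
    # Assumed range for the demo: the first three hours of the recorded data.
    start = datetime(2014, 7, 1, tzinfo=timezone.utc)
    stop = start + timedelta(hours=3)
    for window_start, window_end in hourly_windows(start, stop):
        # Hand this string to your InfluxDB client, then analyze the result.
        print(build_query(window_start, window_end))
```

Each query covers a half-open interval (start inclusive, end exclusive), so a point that falls exactly on an hour boundary is counted once rather than twice.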