I'm using the InfluxData stack (InfluxDB and Kapacitor) for anomaly detection in time series data.
I collected some open-source sample data and wrote the following TICKscript to detect anomalies:
batch
    .query('select mean(value) from "nycTaxi"."default"."nycTaxi"')
        .period(1h)
        .every(2h)
        .groupBy(time(1h))
    .mapReduce(influxql.percentile('mean', 90.0))
    .eval(lambda: sigma("percentile"))
        .as('sigma')
        .keep('percentile', 'sigma')
    .alert()
        .warn(lambda: "sigma" > 2.0)
        .log('/path/alerts.log')
        .crit(lambda: "sigma" > 3.0)
        .log('/path/alerts.log')
It produces alerts like the following:
{
  "id": "nycTaxi:nil",
  "message": "nycTaxi:nil is WARNING",
  "time": "2016-09-13T14:43:21.892057062Z",
  "level": "WARNING",
  "data": {
    "series": [
      {
        "name": "nycTaxi",
        "columns": ["time", "percentile", "sigma"],
        "values": [
          ["2016-09-13T14:43:21.892057062Z", 1279, 2.002345963142575]
        ]
      }
    ]
  }
}
To record the data I used this command:
kapacitor record batch -start 2014-07-01T00:00:00Z -stop 2015-02-31T00:00:00Z -name nyc
For some reason Kapacitor reports the alert time as a 2016 date, even though the latest date in the DB is 2015-01-31. Why does this happen?
I posted an issue in the Kapacitor repo, and the solution was to replay the data with the following command:
kapacitor replay -id RECORDING_ID -name nyc -fast -rec-time
The key here is the -rec-time flag, which makes the replay use the times saved in the recording instead of present times. Kudos to Nathanielc, who solved the issue.
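To confirm that a replay now emits the recorded times, a small check over the alert log can help. The following is only a sketch under a few assumptions: the log holds one JSON alert per line, the helper names (alert_time, check_alert) and the WINDOW_START / WINDOW_STOP values are made up for illustration, and the window should be adjusted to match your own recording.

```shell
#!/bin/sh
# Sketch: check whether an alert timestamp lies inside the recorded window.
# WINDOW_START / WINDOW_STOP are assumptions; set them to match your recording.
WINDOW_START="2014-07-01T00:00:00"
WINDOW_STOP="2015-02-01T00:00:00"

# Print the value of the "time" field from one alert JSON line,
# truncated to whole seconds (YYYY-MM-DDTHH:MM:SS).
alert_time() {
    printf '%s\n' "$1" | sed -n 's/.*"time":"\([^"]*\)".*/\1/p' | cut -c1-19
}

# Print "inside" or "outside" for one alert line.
# Same-format UTC timestamps sort lexicographically, so a string comparison is enough.
check_alert() {
    t=$(alert_time "$1")
    LC_ALL=C awk -v t="$t" -v a="$WINDOW_START" -v b="$WINDOW_STOP" \
        'BEGIN { if (t >= a && t < b) print "inside"; else print "outside" }'
}

# The 2016 timestamp from the alert in the question falls outside the window.
check_alert '{"id":"nycTaxi:nil","time":"2016-09-13T14:43:21.892057062Z","level":"WARNING"}'
```

To run it over a whole log, feed each line to check_alert, for example with a while IFS= read -r line loop reading from /path/alerts.log. After replaying with -rec-time, every line should report "inside".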