Kusto query difference of an "Odometer" value over specified timespan with rollovers to 0

116 views Asked by At

I've got a bit of a potentially unusual requirement for a KQL query. It boils down to finding the difference of the first instance to the last instance of a value within a specified timespan.

Say we got the following schema with value being incremented at random times:

"TableName": OdometerData,
"Schema": timestamp:datetime, value:real

Querying this is simple enough with for example:

OdometerData
| where timestamp between (ago(7d) .. now())
| summarize diff=(max(value) - min(value))

The issue is that this value field will rollover and reset to 0 when it reaches a predetermined value. This means that the query above would not return the proper value if a rollover happened during the span of the query. Correct result would be to get a value that "continues" the value after the rollover instead of resetting to 0 and summarizing the difference.

My question is if there is some built-in functionality or perhaps a way to use a materialized view to keep a value that will continue to increment independent of the value rollover? Or would this be better solved with pre-processing of the data before writing it into the database?

Thanks

1

There are 1 answers

1
Gyp the Cat On BEST ANSWER

This is indeed an interesting scenario. I don't believe there is anything built-in which would fix this entire use but since it's KQL there is usually at least one way of doing it.

Personally I'd approach this with windows functions. Namely serialise, next and row_cumsum which does the incrementation you're asking for in your question. Also add in a case statement for some logic.

An interesting possibly edge case I've commented in the code below is what happens if the value is the same across two sequential value pairs? Do you assume it's gone full circle or assume it's not changed?

Since you also mention pre-processing before this data gets ingested that could absolutely be another sensible option too.

let RollOverPoint = 100; //When does our data rollover
let OdometerData = range timestamp from startofweek(ago(28d)) to endofday(now()) step 1h //Generate some data
| extend value = rand(RollOverPoint); //Assuming Odometer rolls over at x and flips to 0
OdometerData
| where timestamp between (ago(7d) .. now())
| order by timestamp asc
| serialize 
| extend NextValue = next(value, 1)
| extend SumAmount = case (
  NextValue == value, 0, //Deal with identical sequences
  NextValue > value, toint(NextValue - value),
  NextValue < value, toint(((RollOverPoint) - value) + NextValue),
  0
  )
| serialize TotalOdometer = row_cumsum(SumAmount)
| project timestamp, value, TotalOdometer
//| summarize arg_min(StartTime = timestamp, LowestValue = value), arg_max(EndTime = timestamp, TotalValue = TotalOdometer)
timestamp value TotalOdometer
2023-10-25T19:00:00.0000000Z 20.0 80
2023-10-25T20:00:00.0000000Z 0.0 102
2023-10-25T21:00:00.0000000Z 22.0 156
2023-10-25T22:00:00.0000000Z 76.0 188
2023-10-25T23:00:00.0000000Z 8.0 249
2023-10-26T00:00:00.0000000Z 69.0 338
2023-10-26T01:00:00.0000000Z 58.0 372
2023-10-26T02:00:00.0000000Z 92.0 380
2023-10-26T03:00:00.0000000Z 0.0 435
2023-10-26T04:00:00.0000000Z 55.0 500
etc etc etc

Or example output with the summarize above.

StartTime LowestValue EndTime TotalValue
2023-10-25T20:00:00.0000000Z 61 2023-11-01T19:00:00.0000000Z 8337