I have a list of events that occur in a system. My goal is to take the list of events and create a sliding window over the series to determine the rate of event occurrences. The events are loaded into the events list by an application outside the scope of this issue.
Because the system can receive events from multiple sources at the same time, some of the event occurrence timestamps (the value I am using as a key for the series) are the same. What is the proper way to achieve this?
This is the error I get:
An unhandled exception of type 'System.ArgumentException' occurred in Deedle.dll
Additional information: Duplicate key '6/12/2015 3:14:43 AM'. Duplicate keys are not allowed in the index.
My code:
let mutable events = new ResizeArray<StreamEvent>()
let getSeries =
    let eventsKvp = events |> Seq.map(fun(event) -> new KeyValuePair<DateTime,StreamEvent>(event.OccuredAt,event))
    let series = Series(eventsKvp)
    series |> Series.windowDist (TimeSpan(0, 0, 0,30))
Update #1
What isn't depicted here is some C# code which instantiates some of the F# Stream objects and adds events via the Stream.ProcessEvent method. That code is unimportant to the issue I am experiencing here.
I am no longer getting the duplicate key error, but I am now getting this error instead:
Additional information: Floating window aggregation and chunking is only supported on ordered indices.
Update #2
I needed to use sortByKey instead of sort.
Here is my F# code:
namespace Storck.Data

open System
open System.Collections.Generic
open Deedle

type EventType =
    | ClientConnected
    | ClientDisconnect

type Edge(id:string,streamId:string) =
    member this.Id = id
    member this.StreamId = streamId
    member this.Edges = new ResizeArray<Edge>()

type StreamEvent(id:string,originStreamId:string,eventType:EventType,ocurredAt:DateTime) =
    member this.Id = id
    member this.Origin = originStreamId
    member this.EventType = eventType
    member this.OccuredAt = ocurredAt
    override this.Equals(o) =
        match o with
        | :? StreamEvent as sc -> this.Id = sc.Id
        | _ -> false
    override this.GetHashCode() =
        id.GetHashCode()
    interface System.IComparable with
        member this.CompareTo(o) =
            match o with
            | :? StreamEvent as sc -> compare this.Id sc.Id
            | _ -> -1

type Client(id:string) =
    member this.Id=id

// Composite key: the timestamp plus the event id, so keys stay unique.
// The (-) operator is what allows Series.windowDist to work on this key.
type Key =
    | Key of DateTime * string
    static member (-) (Key(a, _), Key(b, _)) = a - b
    override x.ToString() = let (Key(d, s)) = x in d.ToString() + ", " + s

type Stream(id:string, origin:string) =
    let mutable clients = new ResizeArray<Client>()
    let mutable events = new ResizeArray<StreamEvent>()
    // Expose the events list here (the original returned clients by mistake).
    member this.Events = events.AsReadOnly()
    member this.Clients = clients.AsReadOnly()
    member this.Id = id
    member this.Origin = origin
    member this.Edges = new ResizeArray<Edge>()
    member this.ProcessEvent(client:Client,event:StreamEvent) =
        match event.EventType with
        | EventType.ClientConnected ->
            events.Add(event)
            clients.Add(client)
            true
        | EventType.ClientDisconnect ->
            events.Add(event)
            let clientToRemove = clients |> Seq.find(fun(f)-> f.Id = client.Id)
            clients.Remove(clientToRemove)
    member this.GetSeries() =
        let ts = series [ for e in events -> Key(e.OccuredAt, e.Id) => e ]
        // The index must be ordered before floating windows can be computed.
        ts |> Series.sortByKey |> Series.windowDist (TimeSpan(0, 0, 0,30))
One of the design decisions we made in Deedle is that a series can be treated as a continuous series (rather than a sequence of events), and so Deedle does not allow duplicate keys (which make sense for events but not for time series).
I wish there was nicer support for scenarios like yours. It is something we are thinking about for the next version, but I'm not sure how best to do this.
As Fyodor suggests in the comments, you can use a unique index that consists of the date together with something else (either the source or just an ordinal index).
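To illustrate the ordinal-index variant, here is a minimal sketch in plain F# (no Deedle involved). The sample timestamps and values are made up for illustration, and `rawEvents` and `keyed` are hypothetical names. Note that a plain tuple key like this has no `-` operator, so it only addresses the duplicate-key error.

```fsharp
open System

// Hypothetical sample data: the first two events share a timestamp.
let t0 = DateTime(2015, 6, 12, 3, 14, 43)
let rawEvents =
    [ (t0, "connected")
      (t0, "connected")
      (t0.AddSeconds 5.0, "disconnected") ]

// Pair each timestamp with its position in the list so every key is unique.
let keyed =
    rawEvents
    |> List.mapi (fun i (time, value) -> (time, i), value)

// Count the distinct keys; it matches the number of events.
let distinctKeys = keyed |> List.map fst |> List.distinct |> List.length
```

Because F# tuples compare element by element, these keys still sort by timestamp first and by ordinal second.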
If you define the - operator on your key, then you can even use the windowDist function:
Now we can create a bunch of sample events:
Here, I'll use the built-in series function with the Deedle => operator to create a series that maps the keys to values:
And we can even use the windowDist function because the key type supports -!
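The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than a definitive implementation: the sample values and the "source1"/"source2" names are made up, and the `#r` directive assumes the code runs as an F# script with NuGet support. The `Key` type is the same one used in the question's updated code.

```fsharp
#r "nuget: Deedle"   // assumption: F# Interactive with NuGet package support

open System
open Deedle

// Composite key: timestamp plus a source name, so equal timestamps stay unique.
// Subtracting two keys yields the TimeSpan between their timestamps.
type Key =
    | Key of DateTime * string
    static member (-) (Key(a, _), Key(b, _)) = a - b
    override x.ToString() = let (Key(d, s)) = x in d.ToString() + ", " + s

// Hypothetical sample events; the first two share a timestamp.
let t0 = DateTime(2015, 6, 12, 3, 14, 43)

let ts =
    series
        [ Key(t0, "source1") => 1.0
          Key(t0, "source2") => 2.0
          Key(t0.AddSeconds 20.0, "source1") => 3.0
          Key(t0.AddSeconds 45.0, "source2") => 4.0 ]

// Sort by key first (windowing needs an ordered index), then build
// floating windows spanning 30 seconds.
let windows =
    ts
    |> Series.sortByKey
    |> Series.windowDist (TimeSpan(0, 0, 0, 30))

// Number of events in each window, as a rough measure of the event rate.
let counts = windows |> Series.mapValues (fun w -> w.KeyCount)
```

Sorting with `Series.sortByKey` before windowing matters when events arrive out of order, which is what the "only supported on ordered indices" error in the question points at.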