As answered in Esper sum of snapshots (as opposed to sum of deltas), #unique(field) ensures that the sum is using specified field when summing up the values. That worked great except in the case listed below.
Given the following epl:
create schema StockTick(security string, price double);
create schema OrderEvent (orderId int, trader string, symbol string, strategy string,
quantity double);
select orderId, trader, symbol, strategy, sum(quantity)
from OrderEvent#unique(orderId) as o
full outer join StockTick#lastevent as t
on o.symbol = t.security
group by trader, symbol, strategy
The following events:
StockTick={security='IBM', price=99}
OrderEvent={orderID=1, trader="A", symbol='IBM', strategy="VWAP", quantity=10}
StockTick={security='IBM', price=99}
correctly provides the sum as sum(quantity)=10.0.
But if the StockTick did not happen first:
OrderEvent={orderID=1, trader="A", symbol='IBM', strategy="VWAP", quantity=10}
StockTick={security='IBM', price=99}
I still expected the value of quantity to be 10. But in this case it is sum(quantity)=20.0!!
Based on that #unique(field) does, I figure this shouldn’t be 20. Why is #unique(field) not being adhered to and what is needed in the query for the output to be unique on orderId for both cases?
The following set of events is also available to run on Esper Notebook https://notebook.esperonline.net/#/notebook/2HE9662E6
In a streaming join the aggregation result aggregates all rows produced by the join. Since the join produces 2 rows as output with quantity 10 each the result is 20.
You can take an aggregation of the result of the join and unique by order id, you can use insert into. Like so: