Am I aggregating data from Aerospike efficiently?

In Aerospike I have a set ac_1_2015-06-13_15 that contains the spending information for account 1 on 2015-06-13, broken down into 15-minute segments; that is, every record represents one 15-minute segment of the day. Since there are four 15-minute segments in an hour and 24 hours in a day, the set has 96 records. Every record has a single bin, spend.

To calculate the total spend for the day I used AerospikeClient.scanAll(), summing up all the spend values in the scan callback:

totalSpend += record.getDouble("spend");

This takes 351 milliseconds. Is there a more efficient way to calculate the sum, or is this it?

There is 1 answer

Ronen Botzer

In general with key-value stores, you'd want to do such aggregations in place on a single record, if possible, and minimize the number of records used to model the information.

If your set describes a single day, is there a reason why an account isn't a single record? It could have a bin for each hour, with the data type of the bin being a list or a map holding the segments. Other bins can hold aggregated data, such as a running total for the day.
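To make that layout concrete, here is a minimal in-memory sketch in plain Java. It does not call the Aerospike client at all; it only models what one record per account per day could look like. The bin names ("h00" to "h23" and a running total) and the class name DailySpendRecord are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model of ONE record for one account on one day.
// Assumption: 24 hour bins named "h00".."h23", each holding a list of
// four 15-minute segment values, plus one aggregate "total" bin.
final class DailySpendRecord {
    private final Map<String, List<Double>> hourBins = new LinkedHashMap<>();
    private double total = 0.0; // stands in for the aggregate bin

    DailySpendRecord() {
        for (int hour = 0; hour < 24; hour++) {
            hourBins.put(hourBin(hour), new ArrayList<>(Collections.nCopies(4, 0.0)));
        }
    }

    // Hypothetical bin naming scheme: "h00", "h01", ..., "h23".
    static String hourBin(int hour) {
        return String.format("h%02d", hour);
    }

    // Add spend to one segment and keep the aggregate in step, so reading
    // the daily total never requires touching the 96 segment values.
    void addSpend(int hour, int segment, double amount) {
        if (hour < 0 || hour > 23 || segment < 0 || segment > 3) {
            throw new IllegalArgumentException("hour must be 0..23, segment 0..3");
        }
        List<Double> segments = hourBins.get(hourBin(hour));
        segments.set(segment, segments.get(segment) + amount);
        total += amount;
    }

    double segmentSpend(int hour, int segment) {
        return hourBins.get(hourBin(hour)).get(segment);
    }

    double totalSpend() {
        return total;
    }
}
```

With this shape, getting the day's total is a single-record read of one bin rather than a scan over 96 records. In a real deployment the segment update and the total update would have to happen in the same write to that record, which this sketch does not show.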

I would also be careful with having a set per day. There's a limit of 1023 sets per namespace. Is there a reason the day isn't stored as an integer or string bin with a secondary index built over it? That way you could query for the day's data. Or, if it's a Unix timestamp value (secondary index over an integer), you can query for precise ranges of time. A record per day holding all of the data, down to the minute (assuming it stays within record size limits), would make more sense to me.
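As a rough illustration of the timestamp approach, the sketch below computes the integer bounds you would hand to a range query over a secondary index on a Unix-timestamp bin. It is plain Java with no Aerospike calls. The class name DayRange, the choice of UTC as the day boundary, and the 900-second segment size are assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Helpers for turning a calendar day into Unix-timestamp bounds.
// Assumption: days are delimited in UTC and timestamps are in seconds.
final class DayRange {
    private static final long SEGMENT_SECONDS = 15 * 60; // one 15-minute segment

    // Inclusive lower bound: the first second of the day.
    static long beginEpochSeconds(LocalDate day) {
        return day.atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    }

    // Inclusive upper bound: the last second of the day.
    static long endEpochSeconds(LocalDate day) {
        return day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toEpochSecond() - 1;
    }

    // Start of the 15-minute segment that contains the given timestamp.
    static long segmentStart(long epochSeconds) {
        return Math.floorDiv(epochSeconds, SEGMENT_SECONDS) * SEGMENT_SECONDS;
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2015, 6, 13);
        System.out.println(beginEpochSeconds(day) + " .. " + endEpochSeconds(day));
    }
}
```

Storing each segment's start timestamp in an indexed integer bin would let the same data live in one set for all days, with the day (or any narrower window) selected by these two bounds instead of by set name.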