Sum counts across multiple days and group by hour

I am trying to group records by hour across multiple days using Knex. So, for example, 9AM would be:

{
  hour: '9AM', // across days 12/11, 12/12, 12/13
  count: 10    // 10 ids in total for the 9AM hour over those days
}

Given the records in this snapshot, how can I aggregate them into hour buckets across multiple days?

If I output the query results, you can see two separate results for 19:00 for 12/12 and 12/13. These two days' counts need to be summed into one hour 19:00 grouping:

ROWS [ anonymous {
    session_ids: [ 3200 ],
    hour: 2016-12-12T14:00:00.000Z,
    count: '1' },
  anonymous {
    session_ids: [ 3201 ],
    hour: 2016-12-12T15:00:00.000Z,
    count: '1' },
  anonymous {
    session_ids: [ 3203, 3202 ],
    hour: 2016-12-12T19:00:00.000Z,
    count: '2' },
  anonymous {
    session_ids: [ 3204, 3205 ],
    hour: 2016-12-13T19:00:00.000Z, // This count should be aggregated into the `19:00` grouping above
    count: '2' } ]

My current query:

var qry = db.knex
  .select(db.knex.raw("array_agg(t2.id) as session_ids, date_trunc('hour', t2.start_timestamp) as hour"))
  .count('*')
  .from('sessions as t2')
  .groupByRaw("date_trunc('hour', t2.start_timestamp)")
  .orderBy(db.knex.raw("date_trunc('hour', t2.start_timestamp)"));

Answer by Marth (best answer):

Use EXTRACT, not date_trunc:

var qry = db.knex
  .select(db.knex.raw("array_agg(t2.id) as session_ids, extract('hour' from t2.start_timestamp) as hour"))
  .count('*')
  .from('sessions as t2')
  .groupByRaw("extract('hour' from t2.start_timestamp)")
  .orderBy(db.knex.raw("extract('hour' from t2.start_timestamp)"));

date_trunc truncates a timestamp to the specified precision but keeps the date, so two timestamps with the same hour on different days still produce different values and end up in separate groups:

SELECT date_trunc('hour', NOW());
┌────────────────────────┐
│       date_trunc       │
├────────────────────────┤
│ 2016-12-18 19:00:00+01 │
└────────────────────────┘
(1 row)

while EXTRACT returns only the requested field, so the same hour on any day yields the same value and the rows group together:

SELECT extract('hour' from NOW());
┌───────────┐
│ date_part │
├───────────┤
│        19 │
└───────────┘
(1 row)
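
To make the difference concrete without a database, here is a minimal JavaScript sketch that mimics both groupings in memory. The ids and hours are taken from the output in the question, but the minutes are invented, and truncToHour, extractHour and groupBy are hypothetical helpers written for this illustration, not Knex or PostgreSQL functions. The sketch works in UTC, whereas PostgreSQL evaluates both functions in the session time zone for timestamptz columns, so the hour values may differ from what your database returns.

```javascript
// Hypothetical sample data: ids and hours come from the question's output,
// the minutes are invented for illustration.
const sessions = [
  { id: 3200, start: new Date('2016-12-12T14:05:00Z') },
  { id: 3201, start: new Date('2016-12-12T15:20:00Z') },
  { id: 3202, start: new Date('2016-12-12T19:10:00Z') },
  { id: 3203, start: new Date('2016-12-12T19:45:00Z') },
  { id: 3204, start: new Date('2016-12-13T19:15:00Z') },
  { id: 3205, start: new Date('2016-12-13T19:30:00Z') },
];

// Rough analogue of date_trunc('hour', ts): keeps the date, zeroes minutes and below.
function truncToHour(d) {
  return new Date(Date.UTC(
    d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate(), d.getUTCHours()
  )).toISOString();
}

// Rough analogue of extract(hour from ts): keeps only the hour of day (0-23).
function extractHour(d) {
  return d.getUTCHours();
}

// Group rows by the key computed from their start time,
// collecting the ids and a count for each key.
function groupBy(rows, keyFn) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyFn(row.start);
    if (!groups.has(key)) {
      groups.set(key, { hour: key, session_ids: [], count: 0 });
    }
    const group = groups.get(key);
    group.session_ids.push(row.id);
    group.count += 1;
  }
  return [...groups.values()];
}

const byTruncatedHour = groupBy(sessions, truncToHour);
const byHourOfDay = groupBy(sessions, extractHour);

console.log(byTruncatedHour.length); // 4 groups: 19:00 appears once per day
console.log(byHourOfDay.length);     // 3 groups: both days share hour 19
console.log(byHourOfDay.find((g) => g.hour === 19));
```

With the truncated key, 19:00 on 12/12 and 19:00 on 12/13 stay in separate groups, which is the behaviour described in the question. With the hour-of-day key they merge into a single group holding all four ids.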