Group by date with sparkline-like data in a single query


I have time-series data from similar hosts stored in a ClickHouse table with the following structure:

 event_type | event_date
------------|---------------------
 type_1     | 2017-11-09 20:11:28
 type_1     | 2017-11-09 20:11:25
 type_2     | 2017-11-09 20:11:23
 type_2     | 2017-11-09 20:11:21

Each row in the table represents one occurrence (a value of 1) of event_type at the given datetime. To quickly assess the situation, I need to show the sum (total) plus the last seven values (pulse), like this:

 event_type | day        | total | pulse                       
------------|------------|-------|-----------------------------
 type_1     | 2017-11-09 | 876   | 12,9,23,67,5,34,10          
 type_2     | 2017-11-09 | 11865 | 267,120,234,425,102,230,150 

I tried to get this with a single query as follows, but it failed: the pulse consists of the same value repeated seven times:

with
    arrayMap(x -> today() - 7 + x, range(7)) as week_range,
    arrayMap(x -> count(event_type), week_range) as pulse
select
    event_type,
    toDate(event_date) as day,
    count() as total,
    pulse
from database.table
group by day, event_type

 event_type | day        | total | pulse
------------|------------|-------|-------------------------------------------
 type_1     | 2017-11-09 | 876   | 876,876,876,876,876,876,876
 type_2     | 2017-11-09 | 11865 | 11865,11865,11865,11865,11865,11865,11865

Please point out where my mistake is and how to get the desired result.

Denny Crane (best answer)

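select
    event_type,
    groupArray(1)(day)[1] as latest_day,  -- most recent day (rows arrive newest first)
    arraySum(pulse) as total7,            -- sum of the values collected in pulse
    groupArray(7)(cnt) as pulse           -- per-day counts for the 7 most recent days
from (
    -- one row per event_type and day, newest day first
    select
        event_type,
        toDate(event_date) as day,
        count() as cnt
    from database.table
    where day >= today() - 30
    group by event_type, day
    order by event_type, day desc
)
group by event_type
order by event_type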
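
The original attempt repeats one value because the lambda passed to arrayMap never uses x, so count(event_type) is evaluated once per group and copied seven times. The query above instead counts per day first and only then collects the counts into an array. As a rough illustration of that two-step logic (not of ClickHouse internals), here is a small Python emulation; the function name pulse_by_type and the sample events are made up for this sketch.

```python
from collections import Counter, defaultdict
from datetime import datetime

def pulse_by_type(events, limit=7):
    """events: iterable of (event_type, datetime) pairs.
    Returns {event_type: (latest_day, total, pulse)}, pulse newest first."""
    # Inner query: count() grouped by (event_type, day).
    per_day = Counter((etype, ts.date()) for etype, ts in events)

    # Outer query: gather the per-day counts for each event_type.
    by_type = defaultdict(list)
    for (etype, day), cnt in per_day.items():
        by_type[etype].append((day, cnt))

    result = {}
    for etype, days in sorted(by_type.items()):
        days.sort(reverse=True)            # like ORDER BY day DESC
        newest = days[:limit]              # like groupArray(7)(cnt)
        pulse = [cnt for _, cnt in newest]
        result[etype] = (newest[0][0], sum(pulse), pulse)
    return result

# Hypothetical sample rows, loosely based on the table in the question.
events = [
    ("type_1", datetime(2017, 11, 9, 20, 11, 28)),
    ("type_1", datetime(2017, 11, 9, 20, 11, 25)),
    ("type_1", datetime(2017, 11, 8, 10, 0, 0)),
    ("type_2", datetime(2017, 11, 9, 20, 11, 23)),
]
print(pulse_by_type(events))
```

The pulse comes out newest first, matching the order by ... desc in the subquery; reverse the list if an oldest-first sparkline is wanted.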

vladimir

I would consider calculating the pulse in the application (server-side) code, with ClickHouse only providing the required data.
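
A minimal sketch of that application-side approach, assuming ClickHouse has already returned per-day counts for one event_type (for example from a group by event_type, day query). The function name build_pulse and the sample counts are hypothetical. One advantage of doing this in code is that days without events can be filled with zeros.

```python
from datetime import date, timedelta

def build_pulse(daily_counts, end_day, days=7):
    """daily_counts: {date: count} for a single event_type.
    Returns counts for the `days` days ending at end_day, oldest first.
    Days missing from daily_counts are reported as 0."""
    window = [end_day - timedelta(days=offset)
              for offset in range(days - 1, -1, -1)]
    return [daily_counts.get(day, 0) for day in window]

# Hypothetical per-day counts with gaps on some days.
counts = {
    date(2017, 11, 9): 10,
    date(2017, 11, 8): 34,
    date(2017, 11, 5): 23,
}
print(build_pulse(counts, date(2017, 11, 9)))  # [0, 0, 23, 0, 0, 34, 10]
```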


The neighbor function can also be used as a sliding window:

SELECT
    number,
    [neighbor(number, -7), neighbor(number, -6), neighbor(number, -5), neighbor(number, -4), neighbor(number, -3), neighbor(number, -2), neighbor(number, -1)] AS pulse
FROM
(
    SELECT number
    FROM numbers(10, 15)
    ORDER BY number ASC
)

┌─number─┬─pulse──────────────────┐
│     10 │ [0,0,0,0,0,0,0]        │
│     11 │ [0,0,0,0,0,0,10]       │
│     12 │ [0,0,0,0,0,10,11]      │
│     13 │ [0,0,0,0,10,11,12]     │
│     14 │ [0,0,0,10,11,12,13]    │
│     15 │ [0,0,10,11,12,13,14]   │
│     16 │ [0,10,11,12,13,14,15]  │
│     17 │ [10,11,12,13,14,15,16] │
│     18 │ [11,12,13,14,15,16,17] │
│     19 │ [12,13,14,15,16,17,18] │
│     20 │ [13,14,15,16,17,18,19] │
│     21 │ [14,15,16,17,18,19,20] │
│     22 │ [15,16,17,18,19,20,21] │
│     23 │ [16,17,18,19,20,21,22] │
│     24 │ [17,18,19,20,21,22,23] │
└────────┴────────────────────────┘
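
To make the window logic in the output above explicit, here is a small Python emulation of what the query computes: for each row, the seven preceding values, padded with 0 where no earlier row exists. The name neighbor_pulse is made up for this sketch. Note that in ClickHouse itself neighbor only sees rows inside the currently processed data block, so on large inputs the real function may not behave like this simple list-based version.

```python
def neighbor_pulse(values, width=7, default=0):
    """For each value, collect the `width` preceding values (oldest first),
    using `default` where the offset falls before the start of the list."""
    rows = []
    for i, value in enumerate(values):
        pulse = [values[i - k] if i - k >= 0 else default
                 for k in range(width, 0, -1)]
        rows.append((value, pulse))
    return rows

# Same input as numbers(10, 15): the 15 integers 10..24.
for number, pulse in neighbor_pulse(list(range(10, 25))):
    print(number, pulse)
```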