Cassandra - get the latest value within group

27 views Asked by At

Let's say I have the following table which possibly contains dozens of millions of rows and thousands of guids:

| Guid   | Measurement time | Signal |
| ------ | ---------------- | ------ |
| xxx1   | now              | 10     |
| xxx2   | now              | 5      |
| xxx1   | now-1h           | 9      |
| xxx2   | now-1h           | 6      |

The table DDL:

create table sensor_data_v3
(
    guid             text,
    measurement_time bigint,
    signal           float,
    ... some more columns ...
    primary key (guid, measurement_time)
)
    with clustering order by (measurement_time desc)
     and caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
     and compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
     and compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
     and dclocal_read_repair_chance = 0.1
     and speculative_retry = '99PERCENTILE'
     and read_repair_chance = 0;

I need to get the latest signal for each guid, how do I do that?

Cassandra version is 3.11 if I'm not mistaken.

1

There are 1 answers

0
Vitaly Chura On

After reading this answer - Cassandra - How group by latest timestamp it seems just I just had to run the following:

select guid, measurement_time, signal from sensor_data_v3 group by guid;