I need to get a list of N users with the most recent actions using cassandra.
I have tried to use the following schema:
CREATE TABLE user_actions(
partition_key int,
username int,
action_time timestamp,
PRIMARY KEY(partition_key, action_time)
WITH CLUSTERING ORDER BY(action_time DESC);
When a user first logs in it creates a new entry. However, I'm unable to update the start_time
property as it's used as a clustering key.
I could delete the old data and create a new entry when the user takes another action - but user actions occur frequently and we will quickly hit the tombstone limit.
Does anyone know how best I could model the data in order to achieve this?
You dont need to delete individual cells. Just keep appending. Add another component to the partition key to keep from getting too wide:
Then query using something like (pseudo code):
to get the last N users. When theres not N in the partition, decrement day by 1 and query that too. Set TTL to 1 to 5 days or something and old records will be compacted away. This assumes you have at least N actions in 24 hours, or else you may get a not full list. If your records are super active can use
hour
instead of day.The day/hour component can just be a simple
floor(time() / (60*60*24))