Granularity level in clustering key (high unique values)


I am fairly new to Cassandra data modelling and am trying to understand whether a clustering key can hold a very large number of unique values. For example, we have four columns: storeid, shipping_status, orderid and guestname. We have approximately 3000 stores, 4 status types and a large number of order IDs each day. We need to query on storeid and status, and sometimes on orderid. So I am planning to use storeid and status as the partition key and orderid as the clustering key. My question is: can I keep a column of such fine granularity in the clustering key? orderid will have a huge number of unique values each day. Also, will there be any problem if I add guestname to the clustering key as well? Thanks for your suggestions.


There is 1 answer

Cedric H. On

Using storeid and shipping_status as parts of the partition key and then using orderid as a clustering key makes the situation very similar to time series data.

Cassandra is well suited to storing data with that model (a.k.a. "wide rows" in pre-CQL terms). The hard limit is 2 × 10^9 (2 billion) cells per partition, which caps the number of clustering key values a single partition can hold.
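To get a feel for how far away that limit is, here is a rough back-of-the-envelope sketch. It assumes one clustering key value per order and uses made-up daily volumes per (storeid, shipping_status) pair; the function name and the figures are illustrative, not taken from the question.

```python
# The 2 billion figure quoted above, treated here as a cap on orders per partition.
LIMIT = 2_000_000_000


def years_until_limit(orders_per_day: int, limit: int = LIMIT) -> float:
    """Years an open-ended partition could keep growing before reaching the limit."""
    return limit / (orders_per_day * 365)


if __name__ == "__main__":
    # Hypothetical daily order volumes for a single (storeid, shipping_status) pair.
    for per_day in (1_000, 100_000, 10_000_000):
        years = years_until_limit(per_day)
        print(f"{per_day:>12,} orders/day -> {years:,.1f} years to reach the limit")
```

At ordinary store volumes the limit is a long way off, but an open-ended partition still only ever grows, which is what the chunking advice addresses.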

So you should not go for "open-ended" partitions but use chunking instead: the partition key could be storeid + status + year if the volume of orders per year is much less than 2 × 10^9, or storeid + status + year + month if you're Amazon.
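The chunking idea can be sketched on the application side as follows. This is a minimal illustration, assuming the bucket is derived from an order date that the application knows at write time. The helper names `partition_key` and `month_buckets` are hypothetical and not part of any Cassandra driver.

```python
from datetime import date
from typing import List, Tuple


def partition_key(store_id: int, status: str, order_date: date,
                  by_month: bool = False) -> Tuple[object, ...]:
    """Build the bucketed partition key: storeid + status + year (+ month)."""
    key: Tuple[object, ...] = (store_id, status, order_date.year)
    if by_month:
        key += (order_date.month,)
    return key


def month_buckets(start: date, end: date) -> List[Tuple[int, int]]:
    """List every (year, month) bucket a query from start to end must touch."""
    buckets = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        buckets.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return buckets


if __name__ == "__main__":
    print(partition_key(42, "SHIPPED", date(2024, 3, 9)))
    print(partition_key(42, "SHIPPED", date(2024, 3, 9), by_month=True))
    print(month_buckets(date(2023, 11, 15), date(2024, 2, 1)))
```

The trade-off is that a read spanning several periods has to query one partition per bucket, so the bucket size should match the typical query window as well as the order volume.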

To answer your second question: no, there is no problem with having a table in which all the columns are part of the primary key.