Partition key for a Cassandra table?
In my customer table, customerid is the partition key.
Suppose I add 1 million customers in a year, so I have 1 million partitions.
After 10 years I will have 10 million customers or more, so I will have 10 million partitions.
So my question is: 1) If I want to read the customers table (10 million partitions), does that affect read performance?
Note: a single partition may have 50 to 100 columns.
You have the right idea in that you'll want to use data modeling to create a multi-tenant environment. The caveat is that you don't want to do full-table or multi-partition scans in Cassandra to retrieve that data. The reasons are well documented: in any highly distributed environment you want to minimize network hops, data shuffling, and so on. A read by partition key is routed to the nodes that own that partition, whereas a scan across all partitions has to visit every node. Can't fight physics :)
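To make the difference concrete, here is a rough, self-contained Python sketch of a toy token ring. It is not how Cassandra is implemented: the node names are made up, MD5 stands in for Cassandra's Murmur3 partitioner, each node owns one equal slice of the ring, and replication is ignored. It only illustrates that a read by customerid lands on one owner, while reading every partition touches the whole cluster.

```python
import bisect
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
RING_SIZE = 2 ** 32                               # toy token space

# Each node owns one equal slice of the ring, ending at its boundary token.
BOUNDARIES = [(i + 1) * RING_SIZE // len(NODES) for i in range(len(NODES))]


def token(partition_key: str) -> int:
    """Hash a partition key to a ring position (MD5 stand-in, not Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(partition_key: str) -> str:
    """Return the node whose slice of the ring contains the key's token."""
    return NODES[bisect.bisect_left(BOUNDARIES, token(partition_key))]


def nodes_for_key_read(customer_id: str) -> set:
    """A read by partition key only needs the node that owns that key."""
    return {owner(customer_id)}


def nodes_for_full_scan(customer_ids) -> set:
    """Reading every partition ends up visiting every node that owns data."""
    return {owner(cid) for cid in customer_ids}


customer_ids = [f"cust-{i}" for i in range(10_000)]
print(len(nodes_for_key_read("cust-42")), "node for a single-key read")
print(len(nodes_for_full_scan(customer_ids)), "nodes for a full scan")
```

In this model the cost of a single-key read does not depend on how many other partitions exist, which is why 10 million partitions is not a problem in itself; the scan is what hurts.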
Anyway, this sounds like a reporting type of use case. To report efficiently across multiple partitions like this, you're going to need something like Spark or some other kind of map and reduce.
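As a very rough illustration of the map-and-reduce idea (this is plain Python, not Spark, and not tied to any Cassandra driver), the sketch below splits the keys into slices, aggregates each slice independently, and then merges the partial results. The in-memory CUSTOMERS dict, the country column, and the function names are all hypothetical stand-ins; in a real cluster the slices would typically correspond to token ranges read in parallel.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical in-memory stand-in for the customers table: customerid -> row.
CUSTOMERS = {
    f"cust-{i}": {"country": ("US", "DE", "IN")[i % 3]} for i in range(9_000)
}


def split(keys, n_chunks):
    """Split the keys into n_chunks roughly equal slices (stand-in for token ranges)."""
    keys = list(keys)
    size = -(-len(keys) // n_chunks)  # ceiling division
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def map_chunk(keys):
    """Map step: count customers per country within one slice."""
    return Counter(CUSTOMERS[k]["country"] for k in keys)


def customers_per_country(n_workers=4):
    """Run the map step on each slice in parallel, then reduce the partial counts."""
    chunks = split(CUSTOMERS, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Reduce step: merge the per-slice counters into one result.
    return reduce(lambda a, b: a + b, partials, Counter())


print(customers_per_country())
```

The point of the pattern is that each worker only reads its own slice, so the heavy lifting happens in parallel and only small partial results are combined at the end.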