Can Cassandra be used to replicate the data across 1500 sites?

We work with an eCommerce client that has a central facility and about 1500 stores. We need to develop a data store that supports centralized data management and replicates the data to the stores, so that each store can work autonomously even if it loses connectivity with the central facility.

We are considering Cassandra as the storage implementation, but it is not clear to us how replication should be implemented: should we use Cassandra's native capabilities (i.e. one cluster of 1500 nodes), or build custom replication between 1500 independent Cassandra deployments? Is there any evidence that the native replication mechanism works at such a scale?

It is worth noting that most of the data is partitioned by store. For example, each store has its own product catalog. All catalogs should be stored in the central facility, but it is not necessary to replicate every catalog to every store: a store only needs to receive its own catalog.

There is 1 answer

jbellis On

It should work in theory, but to my knowledge nobody has done it yet. (And contrary to the commenter above, Cassandra's multi-dc support is considerably more robust than Riak's.)
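
To make the multi-dc option concrete, here is a rough sketch of how the "each store receives only its own catalog" requirement could map onto per-keyspace replication settings. It assumes one keyspace per store and one Cassandra datacenter per site; the datacenter names (`central`, `store_0042`), the keyspace naming scheme, and the replication factors are all hypothetical and not taken from the question. The helper only builds the CQL text; it does not connect to a cluster.

```python
def keyspace_cql(store_id: int,
                 central_dc: str = "central",
                 central_rf: int = 3,
                 store_rf: int = 1) -> str:
    """Build a CREATE KEYSPACE statement for one store's catalog.

    Assumption: each store is its own Cassandra datacenter, and the
    keyspace is replicated only to the central DC and that store's DC.
    All names and replication factors here are illustrative.
    """
    store_dc = f"store_{store_id:04d}"    # hypothetical DC name
    keyspace = f"catalog_{store_id:04d}"  # hypothetical keyspace name
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', "
        f"'{central_dc}': {central_rf}, '{store_dc}': {store_rf}}};"
    )


if __name__ == "__main__":
    # Print the statements for the first three stores.
    for store_id in range(1, 4):
        print(keyspace_cql(store_id))
```

With `NetworkTopologyStrategy`, a datacenter that is not listed in the replication map holds no replicas of that keyspace, so other stores would not store this catalog. Note that this layout still means a single cluster with roughly 1500 datacenters, which is exactly the untested scale mentioned above.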