Hazelcast (Java) and ETCD (golang) differences/similarities?

4.7k views Asked by At

Now we building a realtime analytics system and it should be highly distributed. We plan to use distributed locks and counters to ensure data consistency, and we need a some kind of distributed map to know which client is connected to which server. I have no prior experience in distributed systems before, but I think we have two options:

  1. Java+Hazelcast

  2. Golang+ETCD

But what is the pros/cons of each other in topic context?

2

There are 2 answers

4
kuujo On BEST ANSWER

Hazelcast and etcd are two very different systems. The reason is the CAP theorem.

The CAP theorem states that no distributed system can have Consistency, Availability, and Partition-tolerance. Distributed systems normally fall closer to CA or CP. Hazelcast is an AP system, and etcd (being a Raft implementation) is CP. So, your choice is between consistency and availability/performance.

In general, Hazelcast will be much more performant and be able to handle more failures than Raft and etcd, but at the cost of potential data loss or consistency issues. The way Hazelcast works is it partitions data and stores pieces of the data on different nodes. So, in a 5 node cluster, the key "foo" may be stored on nodes 1 and 2, and bar may be stored on nodes 3 and 4. You can control the number of nodes to which Hazelcast replicates data via the Hazelcast and map configuration. However, during a network or other failure, there is some risk that you'll see old data or even lose data in Hazelcast.

Alternatively, Raft and etcd is a single-leader highly consistent system that stores data on all nodes. This means it's not ideal for storing large amounts of state. But even during a network failure, etcd can guarantee that your data will remain consistent. In other words, you'll never see old/stale data. But this comes at a cost. CP systems require that a majority of the cluster be alive to operate normally.

The consistency issue may or may not be relevant for basic key-value storage, but it can be extremely relevant to locks. If you're expecting your locks to be consistent across the entire cluster - meaning only one node can hold a lock even during a network or other failure - do not use Hazelcast. Because Hazelcast sacrifices consistency in favor of availability (again, see th CAP theorem), it's entirely possible that a network failure can lead two nodes to believe a lock is free to be acquired.

Alternatively, Raft guarantees that during a network failure only one node will remain the leader of the etcd cluster, and therefore all decisions are made through that one node. This means that etcd can guarantee it has a consistent view of the cluster state at all times and can ensure that something like a lock can only be obtained by a single process.

Really, you need to consider what you are looking for in your database and go seek it out. The use cases for CP and AP data stores are vastly different. If you want consistency for storing small amounts of state, consistent locks, leader elections, and other coordination tools, use a CP system like ZooKeeper or Consul. If you want high availability and performance at the potential cost of consistency, use Hazelcast or Cassandra or Riak.

Source: I am the author of a Raft implementation

0
David Brimley On

Although this question is now over 3 years old, I'd like to inform subsequent readers that Hazelcast as of 3.12 has a CP based subsystem (based on Raft) for its Atomics and Concurrency APIs. There are plans to roll out CP to more of Hazelcast data structures in the near future. Giving Hazelcast users a true choice between AP and CP concerns and allowing users to apply Hazelcast to new use cases previously handled by systems such as etcd and Zookeeper.

You can read more here...

https://hazelcast.com/blog/hazelcast-imdg-3-12-beta-is-released/