Elasticsearch and CAP Theorem


Elasticsearch is a distributed system. As per the CAP theorem, it can satisfy any 2 out of 3 properties. Which one is compromised in Elasticsearch?


There are 4 answers

vermaji (Best Answer)

I strongly disagree with Harshit: Elasticsearch compromises on availability. As he himself mentions, some requests return errors due to the unavailability of shards.

ES guarantees consistency: reads and writes are always consistent. ES guarantees partition tolerance: if a node that was partitioned rejoins the cluster after some time, it is able to recover the missed data up to the current state.

Moreover, there is no distributed system that gives up on partition tolerance, because without a guarantee of partition tolerance a distributed system cannot exist.

Puneet Garg

Elasticsearch is AP. It has eventual consistency for reads, because you may be querying a replica which does not have the data replicated to it yet.

Writes always go to the primary shard. The primary then replicates to each replica shard (in parallel).

  • ARS (Adaptive Replica Selection) sends the query to the best available shard copy
  • Read queries are not all sent to the primary shard
  • Requests are distributed across the replica shards of the replication group

So your query may end up on a shard which does not have the latest update of the data.
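A minimal sketch of that read/write flow in REST (Dev Tools) syntax, assuming a hypothetical index named my-index; the ARS cluster setting shown is enabled by default:

```
# A write always goes through the primary shard, which then
# replicates it to the replica shards in parallel
PUT /my-index/_doc/1
{ "user": "alice", "status": "active" }

# A search may be routed by ARS to any copy (primary or replica),
# so it can observe a state that predates the latest write
GET /my-index/_search
{ "query": { "match": { "user": "alice" } } }

# ARS is controlled by this cluster setting (enabled by default)
PUT /_cluster/settings
{ "persistent": { "cluster.routing.use_adaptive_replica_selection": true } }
```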

Harshit

CAP theorem states that a distributed system can have at most two of the following:

  1. Consistency.
  2. Availability.
  3. Partition Tolerance.

Elasticsearch gives up on "Partition Tolerance"

Reason: if the creation of a node fails, the cluster health turns red and the cluster will not proceed to operate on the newly created index.

It will not give up on "Availability" because every Elasticsearch query returns a response from the cluster: either results (success) or an error (failure).

It will not give up on "Consistency" either. If it gave up on consistency, there would be no document versioning and no index recovery.
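For illustration, document versioning is visible in the API: every index response carries _version, _seq_no, and _primary_term metadata, which can be used for optimistic concurrency control. A sketch with a hypothetical index name (the sequence numbers below are illustrative, not real values):

```
# The response to an index operation includes _version, _seq_no
# and _primary_term metadata
PUT /my-index/_doc/1
{ "counter": 1 }

# A conditional write is rejected with a 409 conflict if the
# document has changed since it was read, preserving consistency
PUT /my-index/_doc/1?if_seq_no=5&if_primary_term=1
{ "counter": 2 }
```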

You can read more here: https://discuss.elastic.co/t/elasticsearch-and-the-cap-theorem/15102/8

Varun Garg

The answer is not so straightforward. It depends upon how the system is configured and how you want to use it. I will try to go into the details.

Partitioning in Elasticsearch

  1. Each index is partitioned into shards, meaning the data in each shard is mutually exclusive to the other shards. Each shard further has multiple Lucene indices, which are not in the scope of this answer.
  2. Each shard can have a replica running (most setups have one), and in the event of a failure the replica can be promoted to primary. Let's call a shard that has a working primary reachable from the ES node our application server is hitting an active shard. Conversely, a shard with no reachable primary copy is considered a failed shard. (E.g.: an error saying "all shards failed" means no primaries in that index are available.)
  3. ES can end up with multiple primaries (divergent shards). This is not a good situation, as we lose both read and write consistency.

In an event of a network partition, what will happen to:

  1. Reads:

    1. By default, reads will continue to happen on shards that are active. Thus, the data from failed shards will be excluded from our search queries. In this context, we consider the system to be AP. However, the situation is temporary and does not require manual effort to synchronize shards when the cluster connects again.
    2. By setting the search option allow_partial_search_results [1] to false, we can force the system to return an error when some of the shards have failed, guaranteeing consistent results. In this context, we consider the system to be CP.
    3. In case no primaries are reachable from the node(s) that our application server is connecting to, the system will completely fail. Even if we say that partition tolerance has failed, we also see that availability has taken a hit. This situation can be called just C, or CP.
    4. There can be cases where the team has to bring the shards up no matter what, and their out-of-sync replica(s) are reachable. So they decide to make one a primary (manually). Note that there can be some un-synced data, resulting in divergent shards. This results in an AP situation. Consistency will be hard to restore when the situation normalizes (sync shards manually).
  2. Writes

    1. Only if all shards fail will writes stop working. Even if just one shard is active, writes will work and are consistent (by default). This will be CP.
    2. However, we can set the option index-wait-for-active-shards [2] to all to ensure writes only happen when all shards in the index are active. I see only a little advantage to this flag, which would be to keep all shards balanced at any cost. This will still be CP (but with less availability than the previous case).
    3. Like in the last read network-partition case, if we make un-synced replicas primary (manually), there can be some data loss and divergent shards. The situation will be AP here, and consistency will be hard to restore when the situation normalizes (sync shards manually).
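The two options referenced as [1] and [2] above can be sketched as follows (hypothetical index name my-index; search.default_allow_partial_results is the cluster-wide counterpart of the per-request flag):

```
# [1] Fail the search outright instead of returning partial
# results when some shards are unavailable (CP-style reads)
GET /my-index/_search?allow_partial_search_results=false
{ "query": { "match_all": {} } }

# The same behaviour can be made the cluster default
PUT /_cluster/settings
{ "persistent": { "search.default_allow_partial_results": false } }

# [2] Accept a write only when every shard copy is active
PUT /my-index/_doc/1?wait_for_active_shards=all
{ "field": "value" }
```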

Based on the above, you can make a more informed decision and tweak Elasticsearch according to your requirements.

References:

  1. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html
  2. https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards