I have been working on a project which involves massive updates to Elasticsearch, and I found that when updates are applied to a single doc at high frequency, consistency cannot be guaranteed.
For each update, this is how we do it (Scala code). Notice that we have to explicitly remove the original fields and replace them with the new ones, because a merge is not what we want (_update in Elasticsearch is in fact a merge).
def replaceFields(alarmId: String, newFields: Map[String, Any]): Future[BulkResponse] = {
  // Build an update that runs a script dropping the existing field from _source.
  def removeField(fieldName: String): UpdateDefinition = {
    log.info("script: " + s"""ctx._source.remove("$fieldName")""")
    update id alarmId in IndexType script s"""ctx._source.remove("$fieldName")"""
  }

  client.execute {
    bulk(
      // Remove every field we are about to replace, then apply the new values
      // as a partial-doc update, all in one bulk request.
      (newFields.toList.map(ele => removeField(ele._1)) :+
        (update id alarmId in IndexType doc newFields)): _*
    )
  }
}
It cannot. You can increase the write consistency level to all (see "Understanding the write_consistency and quorum rule of Elasticsearch" for some discussion around this; also see the docs https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-index_.html#index-consistency), and that would get you closer. But Elasticsearch does not make any linearizability guarantees (e.g. https://aphyr.com/posts/317-jepsen-elasticsearch for examples and https://aphyr.com/posts/313-strong-consistency-models for definitions), and it is not difficult to cook up scenarios in which ES will not be consistent.
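For illustration, a minimal sketch of what raising the write consistency looks like, assuming the native Java client that elastic4s wraps (ES 2.x API); the index/type names and the `javaClient` handle are placeholders, and I have not checked whether your elastic4s version exposes this directly in its DSL:

import org.elasticsearch.action.WriteConsistencyLevel
import org.elasticsearch.client.Client

// Hedged sketch (ES 2.x Java API): index a doc, but fail fast unless all shard
// copies are available to take the write. "alarms"/"alarm" are placeholder names.
def indexWithFullConsistency(javaClient: Client, alarmId: String,
                             fields: java.util.Map[String, AnyRef]) = {
  javaClient.prepareIndex("alarms", "alarm", alarmId)
    .setSource(fields)
    .setConsistencyLevel(WriteConsistencyLevel.ALL)
    .get()
}

The same consistency setting can be passed on the REST API as the consistency query parameter documented in the 2.4 index docs linked above.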
That being said, it tends to be consistent most of the time. But in a high-update environment you're going to be putting a lot of GC pressure on your JVM to clean out the old docs. I assume you know how updates work under the hood in ES but, in case you don't, it's also worth paying attention to https://www.elastic.co/guide/en/elasticsearch/reference/current/_updating_documents.html
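To make that concrete, here is a rough sketch of what an _update amounts to internally: a get of the current source and version, an in-memory modification, and a reindex guarded by that version. This is an illustration against the ES 2.x Java client, not your code; the index/type names, `javaClient`, and the lack of a retry loop are all placeholders:

import org.elasticsearch.client.Client
import org.elasticsearch.index.engine.VersionConflictEngineException

// Hedged sketch of what _update does under the hood: read, modify, reindex with
// the version we read, so a concurrent write raises VersionConflictEngineException
// instead of being silently overwritten. With many writers hitting one doc at high
// frequency, this conflict/retry path is exactly where the contention shows up.
def readModifyReindex(javaClient: Client, alarmId: String,
                      newFields: Map[String, AnyRef]): Unit = {
  val got = javaClient.prepareGet("alarms", "alarm", alarmId).get()
  val source = got.getSourceAsMap            // current _source as a mutable java.util.Map
  newFields.foreach { case (k, v) => source.put(k, v) }
  try {
    javaClient.prepareIndex("alarms", "alarm", alarmId)
      .setSource(source)
      .setVersion(got.getVersion)            // optimistic concurrency check
      .get()
  } catch {
    case _: VersionConflictEngineException =>
      // another update landed between our get and our reindex; retry or give up here
  }
}

Every such update leaves the previous version of the doc behind as a deleted entry in its segment, which is why high-frequency updates to a single doc translate into extra merge and GC work.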