I'm attempting to use the bulk HTTP API in Java on AWS Elasticsearch 2.3. When I use a REST client for the bulk load, I get the following error:
504 GATEWAY_TIMEOUT
When I run it as a Java Lambda doing HTTP POSTs, I get:
{
"errorMessage": "2017-01-09T19:05:32.925Z 8e8164a7-d69e-11e6-8954-f3ac8e70b5be Task timed out after 15.00 seconds"
}
Through testing I noticed the bulk API doesn't work with these settings:
"number_of_shards" : 5,
"number_of_replicas" : 5
When shards and replicas are both set to 1, I can do a bulk load with no problem. I have also tried using this setting to allow for my bulk load:
"refresh_interval" : -1
but so far it has made no impact at all. In the Java Lambda, I load my data as an InputStream from an S3 location. What are my options at this point for Java HTTP? Is there anything else in the index settings I could try? Is there anything else in the AWS access policy I could try? Thank you for your time.
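(For reference, here is roughly how the data gets pulled from S3; a minimal sketch using the AWS SDK for Java v1, with the bucket and key names as placeholders:)
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;
import java.io.InputStream;

// Stream the bulk payload straight from S3 into the HTTP POST body
AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
S3Object object = s3.getObject("my-bucket", "bulk-payload.json");
InputStream myInputStream = object.getObjectContent();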
Edit 1:
I have also tried these params: _bulk?action.write_consistency=one&refresh, but it makes no difference so far.
Edit 2:
Here is what made my bulk load work: setting the consistency param (I did NOT need to set refresh_interval):
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.InputStreamEntity;
// consistency=one lets the bulk request proceed with just the primary shard
URIBuilder uriBuilder = new URIBuilder(myuri);
uriBuilder = uriBuilder.addParameter("consistency", "one");
HttpPost post = new HttpPost(uriBuilder.build());
HttpEntity entity = new InputStreamEntity(myInputStream);
post.setEntity(entity);
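(For completeness, the request still has to be executed; a minimal sketch, assuming a default CloseableHttpClient:)
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

CloseableHttpClient client = HttpClients.createDefault();
try (CloseableHttpResponse response = client.execute(post)) {
    // A 200 response whose body contains "errors": false means every bulk action succeeded
    System.out.println(response.getStatusLine());
}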
From my experience, the issue can occur when your index replication settings cannot be satisfied by your cluster. This happens either during a network partition, or if you simply set a replication requirement that your physical cluster cannot satisfy.
In my case, this happens when I apply my production settings (number_of_replicas: 3) to my development cluster (which is a single-node cluster).
Your two solutions (setting the replicas to 1, or setting your consistency to one) resolve this issue because they allow Elasticsearch to continue the bulk index without waiting for additional replicas to come online.
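To make the arithmetic concrete (as I understand the Elasticsearch 2.x behavior): the default write consistency is quorum, which requires int((1 primary + number_of_replicas) / 2) + 1 active copies of each shard before a write proceeds. With number_of_replicas: 5 that is int(6 / 2) + 1 = 4 copies per shard, which a small cluster cannot allocate (a replica never shares a node with its primary), so the bulk request waits until it times out. consistency=one drops the requirement to just the primary.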
Elasticsearch probably could have a more intuitive message for this failure; maybe it does in Elasticsearch 5.
Setting your cluster to a single node while keeping number_of_replicas above 0 is the quickest way to reproduce this failure.