Riak db returns 500 Internal server error


I'm having an issue with our Riak KV database (v2.2.3), which runs as a 5-node cluster. Last week, out of nowhere, almost every request to the database started returning a 500 Internal Server Error. It may also be worth mentioning that the database has nearly doubled in size over the past two weeks. At first we thought the problem was in the code making the requests. However, after SSHing into one of the nodes in the cluster and attempting a simple request to list buckets, we saw this:

Command:

curl -i http://localhost:8098/buckets?buckets=true

Response:

HTTP/1.1 500 Internal Server Error
Vary: Accept-Encoding
Server: MochiWeb/1.1 WebMachine/1.10.9 (cafe not found)
Date: Sun, 10 Dec 2017 06:21:20 GMT
Content-Type: text/html
Content-Length: 1193

<html><head><title>500 Internal Server Error</title></head><body>
<h1>Internal Server Error</h1>The server encountered an error while 
processing this request:<br><pre>{error,
{error,
    {badmatch,{error,mailbox_overload}},
    [{riak_kv_wm_buckets,produce_bucket_list,2,
         [{file,"src/riak_kv_wm_buckets.erl"},{line,225}]},
     {webmachine_resource,resource_call,3,
         [{file,"src/webmachine_resource.erl"},{line,186}]},
     {webmachine_resource,do,3,
         [{file,"src/webmachine_resource.erl"},{line,142}]},
     {webmachine_decision_core,resource_call,1,
         [{file,"src/webmachine_decision_core.erl"},{line,48}]},
     {webmachine_decision_core,decision,1,
         [{file,"src/webmachine_decision_core.erl"},{line,562}]},
     {webmachine_decision_core,handle_request,2,
         [{file,"src/webmachine_decision_core.erl"},{line,33}]},
     {webmachine_mochiweb,loop,2,
         [{file,"src/webmachine_mochiweb.erl"},{line,72}]},
     {mochiweb_http,headers,5,
         [{file,"src/mochiweb_http.erl"},{line,105}]}]}}</pre><P><HR>
<ADDRESS>mochiweb+webmachine web server</ADDRESS></body></html>
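
To pick the failure reason out of that HTML body without reading the whole trace, a small script can help. This is only a sketch: the helper name `extract_error_reason` and the regular expression are my own assumptions, not part of Riak, and they only cover the `{badmatch,{error,Reason}}` shape seen in the response above.

```python
import re

# Hypothetical helper, not part of Riak: matches the
# {badmatch,{error,Reason}} shape from the 500 response body.
BADMATCH_RE = re.compile(r"\{badmatch,\s*\{error,\s*([a-z][A-Za-z0-9_@]*)\}\}")


def extract_error_reason(body):
    """Return the Erlang atom inside {badmatch,{error,Atom}}, or None."""
    match = BADMATCH_RE.search(body)
    return match.group(1) if match else None


# Fragment copied from the response body in this question.
sample = """<pre>{error,
{error,
    {badmatch,{error,mailbox_overload}},
    [{riak_kv_wm_buckets,produce_bucket_list,2,
"""

print(extract_error_reason(sample))
```

For the response above, the extracted reason is `mailbox_overload`.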

After some more investigation, I pulled the logs on one of the Riak nodes and saw this:

2017-12-10 03:54:58.654 [error] <0.24342.271>@yz_solrq_helper:send_solr_ops_for_entries:301 Updating a batch of Solr operations failed for index <<"attachment">> with error {error,{other,{ok,"500",[{"Content-Type","application/json; charset=UTF-8"},{"Transfer-Encoding","chunked"}],<<"
{\"responseHeader\":{\"status\":500,\"QTime\":1}, "error": {
"msg": "Exception writing document id 1*default*Attachment*07d8e24-dc32s-11e7-q9640-a7f8b4edb446*623 to the index; possible analysis error.",
"trace": "org.apache.solr.common.SolrException: Exception writing document id 1*default*Attachment*07d8e24-dc32s-11e7-q9640-a7f8b4edb446*623 to the index; possible analysis error.
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:169)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:952)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:692)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:141)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:106)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:68)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:99)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1976)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclip..." >>
}
}
}

After some googling, I found reports that this error is caused by java.lang.OutOfMemoryError: Java heap space.
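
One way to check that guess against the logs is to look for the marker strings directly. A rough sketch, assuming the log text has already been read into a string; `classify_solr_failure` is a hypothetical helper, and the two marker strings are taken from the messages quoted in this question:

```python
def classify_solr_failure(log_text):
    """Label a log excerpt by which known marker string it contains."""
    # Marker from the suggested cause (java.lang.OutOfMemoryError: Java heap space).
    if "java.lang.OutOfMemoryError" in log_text:
        return "out_of_memory"
    # Marker from the Solr message in the node log.
    if "possible analysis error" in log_text:
        return "analysis_error"
    return "unknown"


# First line of the Solr trace from the node log.
excerpt = (
    "org.apache.solr.common.SolrException: Exception writing document id "
    "1*default*Attachment*07d8e24-dc32s-11e7-q9640-a7f8b4edb446*623 "
    "to the index; possible analysis error."
)

print(classify_solr_failure(excerpt))
```

On that excerpt the result is `analysis_error`. The logged trace is truncated, though, so an OutOfMemoryError further down the trace cannot be ruled out from this line alone.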

So I modified my riak config from:

search.solr.jvm_options = -d64 -Xms4g -Xmx4g -XX:+UseStringCache -XX:+UseCompressedOops

To:

search.solr.jvm_options = -d64 -Xms8g -Xmx8g -XX:+UseStringCache -XX:+UseCompressedOops

However, I am still getting the same error. Any help is much appreciated.
