UTF8 - CharsetEncoder#encode() - Infinite Loop

We have a Play 2.2.1 application that uses the Jongo plugin to manage MongoDB access and object mapping. Jongo uses a library called bson4jackson, which adds BSON support to the Jackson JSON processor.

We can process hundreds of megabytes of UTF-8 data, but at random a thread never terminates and keeps one CPU core at 100%. With the same data set, the problem sometimes occurs and sometimes does not.

Here is the call stack of the thread:

application-akka.actor.default-dispatcher-12 [RUNNABLE]
de.undercouch.bson4jackson.io.DynamicOutputBuffer.putUTF8(int, String)
de.undercouch.bson4jackson.io.DynamicOutputBuffer.putUTF8(String)
de.undercouch.bson4jackson.BsonGenerator._writeCString(String)
de.undercouch.bson4jackson.BsonGenerator._writeString(String)
de.undercouch.bson4jackson.BsonGenerator.writeString(String)
com.fasterxml.jackson.databind.ObjectWriter.writeValue(OutputStream, Object)
org.jongo.marshall.jackson.JacksonEngine.marshall(Object)
org.jongo.Insert.marshallDocument(Object)
org.jongo.Insert.createDBObjectToInsert(Object)
org.jongo.Insert.save(Object)
org.jongo.MongoCollection.save(Object)
models.audits.PageEntityCollection.save(PageEntity)
services.audit.PageBean.finish()
services.audit.PageBean.auditChangeState(AuditState)
services.audit.dispatcher.SchedulerAPI.schedule(PageBean)
services.audit.dispatcher.PageAuditManager.startAudit(String, String, String, String, String)
controllers.HARReceiver.launchAudit(JsonNode, String)
controllers.HARReceiver$1.run()
akka.dispatch.TaskInvocation.run()
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec()
scala.concurrent.forkjoin.ForkJoinTask.doExec()
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinTask)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool$WorkQueue)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run()

The code of the method, from bson4jackson's DynamicOutputBuffer class, is the following:

    public int putUTF8(int pos, String s) {
            ByteBuffer minibb = null;

            CharsetEncoder enc = getUTF8Encoder();
            CharBuffer in = CharBuffer.wrap(s);

            int pos2 = pos;
            ByteBuffer bb = getBuffer(pos2);
            int index = pos2 % _bufferSize;
            bb.position(index);

            while (in.remaining() > 0) {
                    CoderResult res = enc.encode(in, bb, true);

                    //flush minibb first
                    if (bb == minibb) {
                            bb.flip();
                            while (bb.remaining() > 0) {
                                    putByte(pos2, bb.get());
                                    ++pos2;
                            }
                    } else {
                            pos2 += bb.position() - index;
                    }

                    if (res.isOverflow()) {
                            if (bb.remaining() > 0) {
                                    //exceeded buffer boundaries; write to a small temporary buffer
                                    if (minibb == null) {
                                            minibb = ByteBuffer.allocate(4);
                                    }
                                    minibb.rewind();
                                    bb = minibb;
                                    index = 0;
                            } else {
                                    bb = getBuffer(pos2);
                                    index = pos2 % _bufferSize;
                                    bb.position(index);
                            }
                    } else if (res.isError()) {
                            try {
                                    res.throwException();
                            } catch (CharacterCodingException e) {
                                    throw new RuntimeException("Could not encode string", e);
                            }
                    }
            }

            adaptSize(pos2);
            return pos2 - pos;
    }

We have found that the thread keeps running and never leaves the while (in.remaining() > 0) loop, because we observed repeated calls to the encode() method.
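
One possible explanation, offered as an unconfirmed hypothesis rather than a diagnosis: flip() shrinks a buffer's limit to the number of bytes written, and rewind() resets only the position, so a reused minibb could be left with too little room for the next multi-byte character. In that case encode() reports an overflow without consuming any input, and in.remaining() never reaches zero. The sketch below uses only the JDK; the class MiniBufferDemo and its method names are hypothetical and not part of bson4jackson.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of bson4jackson.
class MiniBufferDemo {

    // Encodes s into target and returns how many chars the encoder consumed.
    static int charsConsumed(ByteBuffer target, String s) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        enc.encode(in, target, true);
        return in.position();
    }

    // Simulates one earlier use of a 4-byte scratch buffer:
    // write 2 bytes, then drain them the way the flush loop does.
    static ByteBuffer usedScratchBuffer() {
        ByteBuffer mini = ByteBuffer.allocate(4);
        charsConsumed(mini, "\u00e9"); // U+00E9 takes 2 bytes in UTF-8
        mini.flip();                   // limit = 2, position = 0
        while (mini.hasRemaining()) {
            mini.get();
        }
        return mini;
    }

    static int consumedAfterRewind() {
        ByteBuffer mini = usedScratchBuffer();
        mini.rewind();                        // position = 0, limit stays at 2
        return charsConsumed(mini, "\u20ac"); // U+20AC needs 3 bytes: does not fit
    }

    static int consumedAfterClear() {
        ByteBuffer mini = usedScratchBuffer();
        mini.clear();                         // position = 0, limit = capacity (4)
        return charsConsumed(mini, "\u20ac"); // now the 3 bytes fit
    }

    public static void main(String[] args) {
        System.out.println("consumed after rewind(): " + consumedAfterRewind()); // 0
        System.out.println("consumed after clear():  " + consumedAfterClear());  // 1
    }
}
```

If this is what happens in putUTF8, replacing minibb.rewind() with minibb.clear() in a local copy of the method would be one way to test the hypothesis.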

We don't really understand why this happens. Nobody on our team has much I/O experience, so we would be grateful for a hint about where the issue comes from, or for a method to debug it.
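
As one debugging method, a progress guard can turn a silent spin into an exception: after each encode() call, check that the encoder either consumed input or produced output, and fail fast otherwise. The following is a minimal sketch that assumes the encoding loop can be copied or wrapped locally; ProgressGuard and encodeChecked are hypothetical names, not part of bson4jackson or Jongo.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical debugging helper; not part of bson4jackson or Jongo.
class ProgressGuard {

    // Encodes s to UTF-8 through a small chunk buffer and fails fast
    // if an iteration neither consumes input nor produces output.
    static byte[] encodeChecked(String s, int chunkSize) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        while (in.hasRemaining()) {
            int before = in.position();
            chunk.clear(); // restore position = 0 and limit = capacity
            CoderResult res = enc.encode(in, chunk, true);
            if (res.isError()) {
                throw new IllegalArgumentException("Could not encode string: " + res);
            }
            if (in.position() == before && chunk.position() == 0) {
                // The loop would spin forever from here, so report it instead.
                throw new IllegalStateException(
                        "No progress at char index " + before + ", result=" + res);
            }
            out.write(chunk.array(), 0, chunk.position());
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeChecked("h\u00e9llo \u20ac", 4);
        System.out.println(bytes.length + " bytes");
    }
}
```

Calling encodeChecked("\u20ac", 2) throws IllegalStateException, because a 3-byte character can never fit in a 2-byte chunk. The same check added to a local copy of putUTF8 would report the char index and the CoderResult at the moment the loop stops advancing.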
