IllegalArgumentException when populating a ChronicleMap with high variability in value size


A while back, I asked this question about a ChronicleMap being used as a Map<String,Set<Integer>>. Basically, we have a collection where the average Set<Integer> has around 400 elements, but the maximum size is 20,000. With ChronicleMap 2, this was causing a rather vicious JVM crash. I moved to ChronicleMap 3.9.1 and now get an exception instead (at least it's not a JVM crash):

java.lang.IllegalArgumentException: Entry is too large: requires 23045 chucks, 6328 is maximum.
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.allocReturnCode(CompiledMapQueryContext.java:1760)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.allocReturnCodeGuarded(CompiledMapQueryContext.java:120)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.alloc(CompiledMapQueryContext.java:3006)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.initEntryAndKey(CompiledMapQueryContext.java:3436)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.putEntry(CompiledMapQueryContext.java:3891)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.doInsert(CompiledMapQueryContext.java:4080)
    at net.openhft.chronicle.map.MapEntryOperations.insert(MapEntryOperations.java:156)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.insert(CompiledMapQueryContext.java:4051)
    at net.openhft.chronicle.map.MapMethods.put(MapMethods.java:88)
    at net.openhft.chronicle.map.VanillaChronicleMap.put(VanillaChronicleMap.java:552)

I suspect this is still because I have values that are far outliers from the mean. I assume ChronicleMap determined the maximum number of chunks to be 6328 based on the average value I gave the builder, but didn't expect a gigantic value needing 23045 chunks.
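For reference, here's a minimal sketch of the kind of builder setup involved; the entry count, average key, and default Serializable-based handling of HashSet values are illustrative assumptions, not my actual code:

    import java.util.HashSet;
    import java.util.Set;
    import net.openhft.chronicle.map.ChronicleMap;

    // Hypothetical setup: sizing is driven by an average value of ~400 integers,
    // so the builder cannot anticipate a 20,000-element outlier.
    Set<Integer> averageValue = new HashSet<>();
    for (int i = 0; i < 400; i++) {
        averageValue.add(i);
    }

    @SuppressWarnings("unchecked")
    Class<Set<Integer>> valueClass = (Class<Set<Integer>>) (Class<?>) Set.class;

    ChronicleMap<String, Set<Integer>> map = ChronicleMap
            .of(String.class, valueClass)
            .entries(1_000_000)            // hypothetical total entry count
            .averageKey("someAverageKey")  // hypothetical average key
            .averageValue(averageValue)    // sizing based on the ~400-element average
            .create();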

So my question is: what's the best way to go about solving this? Some approaches I'm considering, but still not sure on:

  1. Use ChronicleMapBuilder.maxChunksPerEntry or ChronicleMapBuilder.actualChunkSize. That said, how do I deterministically figure out what those should be set to? Also, won't setting them too high probably lead to a lot of fragmentation and slower performance? (A sketch of this option follows the list.)
  2. Have a "max collection size" and split the very large collections into many smaller ones, setting the keys accordingly. For example, if my key XYZ yields a Set<Integer> of size 10,000, I could split it into 5 keys XYZ:1, XYZ:2, etc., each with a set of size 2,000. This feels like a hack around something I should just be able to configure in ChronicleMap, and results in a lot of code that feels like it shouldn't be necessary. I mentioned this same plan in my other question, too. (A sketch of this option also follows the list.)
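For option 1, here's a hedged sketch of what that configuration might look like; the numbers are guesses derived from the ~50x max/average ratio, not values I've validated:

    // Illustrative only: keep the chunk size tied to the average value, but
    // allow enough chunks per entry to cover a worst-case outlier. With default
    // sizing an average entry takes roughly 4-8 chunks, so a ~50x value would
    // need on the order of a few hundred chunks.
    ChronicleMap<String, Set<Integer>> map = ChronicleMap
            .of(String.class, valueClass)   // valueClass as in the earlier sketch
            .entries(1_000_000)             // hypothetical
            .averageValue(averageValue)     // the ~400-element average from above
            .maxChunksPerEntry(512)         // hypothetical cap; must cover the largest entry
            .create();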
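For option 2, a rough sketch of the splitting I have in mind; MAX_PER_KEY and the key format are arbitrary choices:

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Rough sketch: spread one oversized set across keys XYZ:0, XYZ:1, ...
    static final int MAX_PER_KEY = 2_000; // arbitrary cap per sub-key

    static void putSplit(Map<String, Set<Integer>> map, String key, Set<Integer> values) {
        List<Integer> all = new ArrayList<>(values);
        int parts = (all.size() + MAX_PER_KEY - 1) / MAX_PER_KEY; // ceiling division
        for (int p = 0; p < parts; p++) {
            int from = p * MAX_PER_KEY;
            int to = Math.min(from + MAX_PER_KEY, all.size());
            map.put(key + ":" + p, new HashSet<>(all.subList(from, to)));
        }
    }

Reading a value back then needs the part count (stored somewhere, or probed by incrementing the suffix until a key is missing), which is exactly the kind of extra code that makes this feel like a hack.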

Other thoughts/ideas are appreciated!


1 Answer

Answered by leventov (accepted):

If you don't specify maxChunksPerEntry() manually, the maximum size of an entry is limited by the segment tier size, in chunks. So what you need is to make the segment tier size larger. The first thing to try is configuring actualSegments(1), if you are not going to access the map concurrently from multiple threads within the JVM. You have additional control over these configurations via ChronicleMapBuilder.actualChunkSize(), actualChunksPerSegmentTier(), and entriesPerSegment().
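For example, a sketch of the single-segment direction; the sizing numbers are illustrative, and valueClass/averageValue are as in the question's sketch:

    // Illustrative: with one segment, the single segment tier spans the whole
    // map, so its size in chunks is much larger and a huge entry can fit.
    // Only safe if the map is not accessed concurrently from multiple threads.
    ChronicleMap<String, Set<Integer>> map = ChronicleMap
            .of(String.class, valueClass)
            .entries(1_000_000)          // hypothetical
            .averageValue(averageValue)  // hypothetical average value
            .actualSegments(1)
            .create();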

By default, ChronicleMapBuilder chooses the chunk size between 1/8 and 1/4 of the configured average value size, so an average entry occupies roughly 4 to 8 chunks. If your segment tier size is 6328 chunks, your segment(s) are therefore configured to contain about 1,000 entries. If your average value set has 400 elements and the maximum is 20,000, the difference between average and max should be about 50 times; but since the failing entry requires 23045 chunks, it is well more than 2,000 times larger than the average. Probably you have not accounted for something.

Also, for such big values I suggest developing and using a more memory-efficient value serializer, because the default one will generate a lot of garbage. E.g. it could use a primitive IntSet implementation of Set<Integer> from the fastutil, Koloboke, or Koloboke Compile libraries; a sketch of this idea follows.
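A minimal sketch of that direction, assuming Chronicle Map 3's BytesReader/BytesWriter serialization interfaces and a simple length-prefixed int layout (treat the exact signatures as approximate and check them against your version):

    import it.unimi.dsi.fastutil.ints.IntIterator;
    import it.unimi.dsi.fastutil.ints.IntOpenHashSet;
    import it.unimi.dsi.fastutil.ints.IntSet;
    import net.openhft.chronicle.bytes.Bytes;
    import net.openhft.chronicle.hash.serialization.BytesReader;
    import net.openhft.chronicle.hash.serialization.BytesWriter;
    import org.jetbrains.annotations.NotNull;
    import org.jetbrains.annotations.Nullable;

    // Stateless marshaller: writes the set as a size-prefixed run of ints and
    // reads it back into a primitive fastutil set, avoiding Integer boxing.
    enum IntSetMarshaller implements BytesWriter<IntSet>, BytesReader<IntSet> {
        INSTANCE;

        @Override
        public void write(Bytes out, @NotNull IntSet toWrite) {
            out.writeInt(toWrite.size());
            for (IntIterator it = toWrite.iterator(); it.hasNext(); ) {
                out.writeInt(it.nextInt());
            }
        }

        @Override
        public IntSet read(Bytes in, @Nullable IntSet using) {
            int size = in.readInt();
            IntSet set = using != null ? using : new IntOpenHashSet(size);
            set.clear();
            for (int i = 0; i < size; i++) {
                set.add(in.readInt());
            }
            return set;
        }
    }

It would be hooked up via something like valueMarshallers(IntSetMarshaller.INSTANCE, IntSetMarshaller.INSTANCE) on the builder, with the value type declared as IntSet.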

Also, I suggest using the latest version available; Chronicle Map 3.9.1 is already outdated.