How to trigger the pre-load of Hazelcast NearCache?


I understand that the NearCache is populated for a key only after the first get operation on that key in the IMap. Is there any way to trigger a pre-load of the NearCache with all the entries from the cluster?

Use Case:
The key is a simple bean object and the value is a DAO object of type TIntHashMap containing a lot of entries.

Size:
The size of a value object ranges from 0.1MB to 24MB (more than 90% of the entries are smaller than 5MB). The number of entries in the IMap ranges from 150 to 250.

Benchmarks:
The first get call for a key takes 2-3 seconds; later calls for the same key take <10 ms.

Right now I have the routine below, which iterates over the IMap keys and reads each entry to refresh the NearCache.

long startTime = System.currentTimeMillis();

IMap<Object, Object> map = client.getMap("utility-cache");

log.info("Connected to the Cache cluster. Starting the NearCache refresh.");

int i = 0;
for (Object key : map.keySet()) {
    // Each get() fetches the value from the cluster, which populates the NearCache.
    Object value = map.get(key);

    // Measuring object sizes is expensive, so only do it when trace logging is on.
    if (log.isTraceEnabled()) {
        SizeOf sizeOfKey = new SizeOf(key);
        SizeOf sizeOfValue = new SizeOf(value);
        log.trace(String.format("Size of %s Key(%s) Object = %s MB - Size of %s Value Object = %s MB", key.getClass().getSimpleName(), key.toString(),
                sizeOfKey.sizeInMB(), value.getClass().getSimpleName(), sizeOfValue.sizeInMB()));
    }

    i++;
}

log.info("Refreshed NearCache with " + i + " Entries in " + (System.currentTimeMillis() - startTime) + " ms");

There is 1 answer

Donnerbart On BEST ANSWER

As you said, the Near Cache gets populated on get() calls on IMap or JCache data structures. At the moment there is no system to automatically preload any data.

For efficiency you can use getAll() which will get the data in batches. This should improve the performance of your own preloading functionality. You can vary your batch sizes until you find the optimum for your use case.
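A minimal sketch of such a batched preload, not a definitive implementation. The class name NearCachePreloader, the preload method and the batchLoader parameter are hypothetical names chosen for illustration. The loader is passed in as a function so the batching logic does not depend on Hazelcast itself, and the sketch assumes that getAll() populates the Near Cache as described above.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Hazelcast.
class NearCachePreloader {

    /**
     * Splits the keys into batches of at most batchSize and hands each batch
     * to batchLoader (for example map::getAll on an IMap).
     * Returns the total number of entries the loader returned.
     */
    static <K, V> int preload(Collection<K> keys, int batchSize,
                              Function<Set<K>, Map<K, V>> batchLoader) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int loaded = 0;
        Set<K> batch = new LinkedHashSet<>();
        for (K key : keys) {
            batch.add(key);
            if (batch.size() == batchSize) {
                // Batch is full: load it and start a new one.
                loaded += batchLoader.apply(batch).size();
                batch = new LinkedHashSet<>();
            }
        }
        if (!batch.isEmpty()) {
            // Load the last, partially filled batch.
            loaded += batchLoader.apply(batch).size();
        }
        return loaded;
    }
}
```

With the IMap from the question this could be called as NearCachePreloader.preload(map.keySet(), 20, map::getAll). The batch size of 20 is only a starting point; with values of up to 24MB, smaller batches may be needed to keep the memory used per call under control.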

With Hazelcast 3.8 there will be a Near Cache preloader feature, which stores the keys of the Near Cache on disk. When the Hazelcast client is restarted, those keys are pre-fetched to re-populate the previous hot data set in the Near Cache as fast as possible (only the keys are stored, the data is fetched again from the cluster). So this won't help for the first deployment, but it will for all following restarts. Maybe this is already what you are looking for?

You can test the feature in the 3.8-EA or the recent 3.8-SNAPSHOT version. The documentation for the configuration can be found here: http://docs.hazelcast.org/docs/latest-dev/manual/html-single/index.html#configuring-near-cache

Please be aware that we changed the configuration parameter from file-name to filename between the EA and the current SNAPSHOT. I recommend the SNAPSHOT version, since we also made some other improvements in the preloader code.