Let's say I have a huge LinkedHashMap and I need to remove elements from it in parallel, based on some predicate.
Filling the map:
    Map<Long, String> map = new LinkedHashMap<>();
    for (long i = 0; i < Integer.MAX_VALUE / 1000; i++) {
        // Each value is ten random UUIDs concatenated together.
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < 10; j++) {
            sb.append(UUID.randomUUID().toString());
        }
        map.put(i, sb.toString());
    }
Removing elements with a single-threaded loop:
    Iterator<Entry<Long, String>> it = map.entrySet().iterator();
    while (it.hasNext()) {
        // Remove every entry whose value contains SEARCH_STR.
        if (it.next().getValue().indexOf(SEARCH_STR) != -1) {
            it.remove();
        }
    }
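For reference, the same single-threaded removal can be written more compactly with `Collection.removeIf` (Java 8). This is only a minimal sketch: the class name, the `removeMatching` helper, and the value `"abc"` for `SEARCH_STR` are placeholders I made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RemoveIfSketch {
    // Hypothetical token standing in for SEARCH_STR from the loop above.
    static final String SEARCH_STR = "abc";

    // Removes every entry whose value contains SEARCH_STR.
    // Returns true if at least one entry was removed.
    static boolean removeMatching(Map<Long, String> map) {
        // values() is a live view, so removing from it removes the map entry.
        return map.values().removeIf(v -> v.contains(SEARCH_STR));
    }

    public static void main(String[] args) {
        Map<Long, String> map = new LinkedHashMap<>();
        map.put(1L, "xxabcxx");
        map.put(2L, "no match");
        map.put(3L, "abc");
        removeMatching(map);
        System.out.println(map); // prints {2=no match}
    }
}
```

This is still a sequential pass over the map, so it is not expected to be faster than the iterator loop, only shorter.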
Trying to do it in parallel (just an example, and it works slowly):
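    // Assumed to be declared elsewhere: executor (an ExecutorService) and
    // result (a shared Map<Long, String> that receives the entries to keep).
    List<Future<?>> futures = new ArrayList<>();
    long to = 0, from = 0, size = map.size();
    do {
        // Process the key range [from, to) in chunks of 550000 keys.
        to += 550000;
        to = Math.min(to, size);
        long _from = from, _to = to;
        Future<?> future = executor.submit(() -> {
            // subMap is a SortedMap/NavigableMap method, so map has to be
            // e.g. a TreeMap here; LinkedHashMap does not provide it.
            for (Entry<Long, String> entry : map.subMap(_from, _to).entrySet()) {
                // Keep only the entries that do NOT contain SEARCH_STR.
                if (entry.getValue().indexOf(SEARCH_STR) == -1) {
                    synchronized (result) {
                        result.put(entry.getKey(), entry.getValue());
                    }
                }
            }
        });
        from = to;
        futures.add(future);
    } while (to < size);
    // Wait for all chunks to finish.
    for (Future<?> future : futures) {
        future.get();
    }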
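A variant of the chunked idea that I would expect to contend less is sketched below. It is an untested assumption on my side, not a measured improvement. Each task collects the matching keys into its own list instead of locking a shared `result`, and the map is modified only on the calling thread afterwards. It also avoids `subMap` by slicing an `ArrayList` snapshot of the entries. `ChunkedScanSketch`, `removeMatching`, `chunkSize` and `searchStr` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedScanSketch {
    // Scans the map in chunks of chunkSize entries and removes every entry
    // whose value contains searchStr.
    static void removeMatching(Map<Long, String> map, String searchStr,
                               int chunkSize, ExecutorService executor) throws Exception {
        // Snapshot of the entries so chunks can be addressed by index.
        // Only references are copied, not the strings themselves.
        List<Map.Entry<Long, String>> entries = new ArrayList<>(map.entrySet());
        List<Future<List<Long>>> futures = new ArrayList<>();
        for (int from = 0; from < entries.size(); from += chunkSize) {
            List<Map.Entry<Long, String>> chunk =
                    entries.subList(from, Math.min(from + chunkSize, entries.size()));
            // Each task fills its own list, so workers share no lock.
            futures.add(executor.submit(() -> {
                List<Long> hits = new ArrayList<>();
                for (Map.Entry<Long, String> e : chunk) {
                    if (e.getValue().contains(searchStr)) {
                        hits.add(e.getKey());
                    }
                }
                return hits;
            }));
        }
        // The map itself is modified only here, on the calling thread.
        for (Future<List<Long>> f : futures) {
            for (Long key : f.get()) {
                map.remove(key);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Map<Long, String> map = new LinkedHashMap<>();
        for (long i = 0; i < 10; i++) {
            map.put(i, i % 2 == 0 ? "hit-abc" : "miss");
        }
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            removeMatching(map, "abc", 3, executor);
        } finally {
            executor.shutdown();
        }
        System.out.println(map.keySet()); // prints [1, 3, 5, 7, 9]
    }
}
```

The scan is read-only, so the worker threads never touch the map's structure. The price is one extra list of entry references plus the lists of matching keys.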
Can I achieve better performance using the Java 8 ForkJoinPool or a parallel stream? The code below simply goes OOM:
    // As written, the filter keeps the entries whose value contains SEARCH_STR.
    Map<Long, String> result = map.entrySet().parallelStream()
            .filter(e -> e.getValue().indexOf(SEARCH_STR) != -1)
            .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
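One direction I have considered, shown only as a sketch and not benchmarked: use the parallel stream just to find the keys to remove, then remove them from the original map on a single thread. That way no second map of values is built. The class and method names below are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ParallelRemoveSketch {
    // Removes every entry whose value contains searchStr.
    static void removeMatching(Map<Long, String> map, String searchStr) {
        // Phase 1: read-only parallel scan that collects only the matching keys.
        List<Long> keysToRemove = map.entrySet().parallelStream()
                .filter(e -> e.getValue().contains(searchStr))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        // Phase 2: structural changes happen on the calling thread only,
        // because LinkedHashMap is not safe for concurrent modification.
        keysToRemove.forEach(map::remove);
    }

    public static void main(String[] args) {
        Map<Long, String> map = new LinkedHashMap<>();
        map.put(1L, "xxabcxx");
        map.put(2L, "no match");
        map.put(3L, "abc");
        removeMatching(map, "abc");
        System.out.println(map); // prints {2=no match}
    }
}
```

Whether this is actually faster than the plain iterator loop probably depends on how expensive the predicate is compared with the cost of splitting the entry set.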