For a HashMap<Integer, Integer> populated with 10,000,000 unique random entries, I time get() while iterating over the map's own keySet(), as in the following snippet:
HashMap<Integer, Integer> hashmap =
        new HashMap<Integer, Integer>(10000000, 0.99f);
// ... Code to put 10000000 unique associations into the hashmap ...
int iteration = 100;
long startTime, totalTime = 0;
while (iteration > 0) {
    for (Integer key : hashmap.keySet()) {
        startTime = System.currentTimeMillis();
        hashmap.get(key);
        totalTime += (System.currentTimeMillis() - startTime);
    }
    iteration--;
}
System.out.println(totalTime / 100 + " ms");
Running the above code, I get: 225 ms
Now, if I change the code to iterate over a HashSet copy of the keys instead, as in the following snippet:
Set<Integer> set = new HashSet<Integer>(hashmap.keySet());
iteration = 100;
totalTime = 0;
while (iteration > 0) {
    for (Integer key : set) {
        startTime = System.currentTimeMillis();
        hashmap.get(key);
        totalTime += (System.currentTimeMillis() - startTime);
    }
    iteration--;
}
System.out.println(totalTime / 100 + " ms");
And after running this code, I get: 414 ms
Why is there such a difference in performance?
P.S.: I used the following JVM arguments:
-Xms2048m -Xmx4096m -XX:MaxPermSize=256m
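(For what it's worth, System.currentTimeMillis() has millisecond granularity, so timing each get() individually mostly measures timer overhead. A more reliable sketch times whole passes with System.nanoTime() and consumes the results so the JIT cannot drop the lookups; the class name Bench and the smaller sizes here are illustrative, not from the original code:)

```java
import java.util.HashMap;

public class Bench {
    public static void main(String[] args) {
        // Smaller map than the original, purely for illustration
        HashMap<Integer, Integer> hashmap = new HashMap<>(1 << 20, 0.99f);
        for (int i = 0; i < 500_000; i++) {
            hashmap.put(i * 31, i); // unique keys, value i
        }

        long sink = 0; // consume results so the JIT can't eliminate the get()
        int passes = 20;
        long start = System.nanoTime();
        for (int p = 0; p < passes; p++) {
            for (Integer key : hashmap.keySet()) {
                sink += hashmap.get(key);
            }
        }
        long perPassMs = (System.nanoTime() - start) / passes / 1_000_000;
        System.out.println(perPassMs + " ms per pass (sink=" + sink + ")");
    }
}
```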
When you read a data structure that is larger than your cache (over 32 KB for a typical L1 cache), how you read it impacts performance.
Typical cache sizes and access costs are roughly: L1 ~32 KB at a few cycles, L2 ~256 KB at ~10 cycles, L3 a few MB at tens of cycles, and main memory at hundreds of cycles.
This means cache locality is very important: reading something from L1 can be ~20x faster than reading it from L3.
In your case you are using a hash-based data structure. It is designed for random access and random arrangement, which unfortunately means it has very poor cache locality: when you access memory at random, each access is likely to land in the slower levels of the memory hierarchy.
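To see this effect in isolation, compare summing an array in index order with summing the same array through a shuffled index order: both passes touch exactly the same data, but only the first walks cache lines sequentially. (The class name AccessPattern and the array size are mine, for illustration.)

```java
import java.util.Random;

public class AccessPattern {
    public static void main(String[] args) {
        int n = 1 << 22; // ~4M ints
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i;

        // Fisher-Yates shuffle of the indices for the "random" pass
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Random rnd = new Random(42);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int t = order[i]; order[i] = order[j]; order[j] = t;
        }

        long seqSum = 0, rndSum = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) seqSum += data[i];        // walks cache lines in order
        long seqNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < n; i++) rndSum += data[order[i]]; // jumps around memory
        long rndNs = System.nanoTime() - t0;

        System.out.println("sequential: " + seqNs / 1_000_000
                + " ms, random: " + rndNs / 1_000_000 + " ms");
        System.out.println("sums equal: " + (seqSum == rndSum));
    }
}
```

On most hardware the random pass is several times slower, even though the work done is identical.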
However, there is an exception. If you access the same data multiple times (e.g. get a key you just obtained from an iterator), or you scan through a collection in its storage order (which is effectively what the iterator does for a HashMap, though not for a TreeMap), then it is much more likely that the next piece of data you access is on the same cache line (each cache line is 64 bytes long) or the next one. These types of access perform much better, as CPUs are designed to perform such sequential, vector-like operations very fast.
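This is why the HashSet version loses out: copying keySet() into a HashSet with a different capacity re-buckets the keys, so the set's iteration order generally no longer matches the map's internal table order, and each get() becomes a jump to an unrelated part of the map. A small sketch of the reordering (the class name IterationOrder and the sizes are mine; the exact orders depend on capacities and the JDK version):

```java
import java.util.*;

public class IterationOrder {
    public static void main(String[] args) {
        // Small map with a large initial capacity, mirroring the question's setup
        Map<Integer, Integer> map = new HashMap<>(1 << 10, 0.99f);
        Random rnd = new Random(1);
        while (map.size() < 20) {
            int k = rnd.nextInt(1_000_000);
            map.put(k, k);
        }

        // A HashSet copy starts at the default capacity and re-buckets the keys,
        // so its iteration order typically differs from the map's table order.
        List<Integer> mapOrder = new ArrayList<>(map.keySet());
        List<Integer> setOrder = new ArrayList<>(new HashSet<>(map.keySet()));

        System.out.println("same elements: "
                + (mapOrder.containsAll(setOrder) && setOrder.containsAll(mapOrder)));
        System.out.println("same order:    " + mapOrder.equals(setOrder));
    }
}
```

Same keys, but a different traversal order relative to the map's layout; in the question's benchmark that turns a near-sequential scan of the map's table into random probing.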
BTW, your working set here is just the set of keys. If your values are distinct objects and you actually read them, I would expect the benchmark to be much slower, since that increases the size of your working set and the amount of memory required to cache it.