fastest way to map a large number of longs

595 views Asked by At

I'm writing a java application that transforms numbers (long) into a small set of result objects. This mapping process is very critical to the app's performance as it is needed very often.

public static Object computeResult(long input) {
    Object result;
    // ... calculate
    return result;
}

There are about 150,000,000 different key objects, and about 3,000 distinct values. The transformation from the input number (long) to the output (immutable object) can be computed by my algorithm with a speed of 4,000,000 transformations per second. (using 4 threads)

I would like to cache the mapping of the 150M different possible inputs to make the translation even faster but i found some difficulties creating such a cache:

public class Cache {
    private static long[] sortedInputs; // 150M length
    private static Object[] results; // 150M length

    public static Object lookupCachedResult(long input) {
        int index = Arrays.binarySearch(sortedInputs, input);
        return results[index];
    }
}

i tried to create two arrays with a length of 150M. the first array holds all possible input longs, and it is sorted numerically. the second array holds a reference to one of the 3000 distinct, precalculated result objects at the index corresponding to the first array's input.

to get to the cached result, i do a binary search for the input number on the first array. the cached result is then looked up in the second array at the same index.

sadly, this cache method is not faster than computing the results. not even half, only about 1.5M lookups per second. (also using 4 threads)

Can anyone think of a faster way to cache results in such a scenario?

I doubt there is a database engine that is able to answer more than 4,000,000 queries per second on, let's say an average workstation.

2

There are 2 answers

4
stewori On

Hashing is the way to go here, but I would avoid using HashMap, as it only works with objects, i.e. must build a Long each time you insert a long, which can slow it down. Maybe this performance issue is not significant due to JIT, but I would recommend at least to try the following and measure performance against the HashMap-variant:

Save your longs in a long-array of some length n > 3000 and do the hashing by hand via a very simple hash-function (and thus efficient) like index = key % n. Since you know your 3000 possible values before hand you can empirically find an array-length n such that this trivial hash-function won't cause collisions. So you circumvent rehashing etc. and have true O(1)-performance.

Secondly I would recommend you to look at Java-numerical libraries like

Both are backed by native Lapack and BLAS implementations that are usually highly optimized by very smart people. Maybe you can formulate your algorithm in terms of matrix/vector-algebra such that it computes the whole long-array at one time (or chunk-wise).

0
maaartinus On

There are about 150,000,000 different key objects, and about 3,000 distinct values.

With the few values, you should ensure that they get re-used (unless they're pretty small objects). For this an Interner is perfect (though you can run your own).

i tried hashmap and treemap, both attempts ended in an outOfMemoryError.

There's a huge memory overhead for both of them. And there isn't much point is using a TreeMap as it uses a sort of binary search which you've already tried.

There are at least three implementations of a long-to-object-map available, google for "primitive collections". This should use slightly more memory than your two arrays. With hashing being usually O(1) (let's ignore the worst case as there's no reason for it to happen, is it?) and much better memory locality, it'll beat(*) your binary search by a factor of 20. You binary search needs log2(150e6), i.e., about 27 steps and hashing may need on the average maybe two. This depends on how tightly you pack the hash table; this is usually a parameter given when it gets created.

In case you run your own (which you most probably shouldn't), I'd suggest to use an array of size 1 << 28, i.e., 268435456 entries, so that you can use bitwise operations for indexing.


(*) Such predictions are hard, but I'm sure it's worth trying.