Exact sum of a long array


In order to get the exact sum of a long[] I'm using the following snippet.

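import java.math.BigInteger;

public static BigInteger sum(long[] a) {
    long low = 0;  // sum of the unsigned lower 32-bit halves; stays below 2**63 for any array length
    long high = 0; // sum of the signed upper 32-bit halves
    for (final long x : a) {
        low += (x & 0xFFFF_FFFFL);
        high += (x >> 32);
    }
    return BigInteger.valueOf(high).shiftLeft(32).add(BigInteger.valueOf(low));
}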

It works by splitting each number into two 32-bit halves, summing the halves separately, and combining the two partial sums at the end. Surprisingly, this method seems to work too:

public static BigInteger fastestSum(long[] a) {
    long low = 0;
    long high = 0;
    for (final long x : a) {
        low += x;
        high += (x >> 32);
    }
    // We know that low has the lowest 64 bits of the exact sum.
    // We also know that BigInteger.valueOf(high).shiftLeft(32) differs from the exact sum by less than 2**63.
    // So the upper half of high is off by at most one.
    high >>= 32;
    if (low < 0) ++high; // Surprisingly, this is enough to fix it.
    return BigInteger.valueOf(high).shiftLeft(64).add(BigInteger.valueOf(low));
}

I don't believe that fastestSum should work as is. I believe that it can be made to work, but that something more has to be done in the final step. However, it passes all my tests (including large random tests). So I'm asking: Can someone prove that it works or find a counterexample?


There are 2 answers

4
ZhongYu On BEST ANSWER
fastestSum(new long[]{+1, -1})  => -18446744073709551616  (that is -2**64, while the exact sum is 0)
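
To see where the counterexample goes wrong, here is a small sketch that copies the question's fastestSum unchanged into a class and runs it on {+1, -1}. The class name CounterexampleTrace is made up for illustration and is not part of the original post.

```java
import java.math.BigInteger;

final class CounterexampleTrace {
    // The question's fastestSum, copied unchanged so the failure can be reproduced.
    static BigInteger fastestSum(long[] a) {
        long low = 0;
        long high = 0;
        for (final long x : a) {
            low += x;
            high += (x >> 32);
        }
        high >>= 32;
        if (low < 0) ++high;
        return BigInteger.valueOf(high).shiftLeft(64).add(BigInteger.valueOf(low));
    }

    public static void main(String[] args) {
        final long[] a = {+1, -1};
        // 1 >> 32 is 0 and -1 >> 32 is -1 (arithmetic shift), so the loop ends
        // with low == 0 and high == -1.
        // high >>= 32 leaves high at -1, and since low is not negative the
        // correction is skipped, giving -1 * 2**64 + 0.
        System.out.println(fastestSum(a));
    }
}
```

The low half is right (0), but the upper half needed the +1 correction even though low is not negative, so testing the sign of low alone is not enough.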
0
maaartinus On

This seems to work. Given that my tests missed the problem with my trivial version, I'm not sure if it's correct. Whoever wants to analyze this is welcome:

public static BigInteger fastestSum(long[] a) {
    long low = 0;
    long control = 0;
    for (final long x : a) {
        low += x;
        control += (x >> 32);
    }
    /*
     We know that low has the lowest 64 bits of the exact sum.
     We also know that 2**32 * control differs from the exact sum by less than 2**63.
     It can't be bigger than the exact sum as the signed shift always rounds towards negative infinity.
     So the upper half of control is either right or must be incremented by one.
     */
    final long x = control & 0xFFFF_FFFFL;
    final long y = (low >> 32);
    long high = (control >> 32);
    if (x - y > 1L << 31) ++high;
    return BigInteger.valueOf(high).shiftLeft(64).add(BigInteger.valueOf(low));
}
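
A sketch of why the final correction might be sound, following the comment in the code. This is my own attempt, not a checked proof. It assumes the array length n is at most 2^31 - 1 (the Java limit), and the symbols S, C, R, c, rho, t, L and k are introduced here only for the argument.

```latex
\begin{align*}
&S=\sum_i a_i,\qquad C=\sum_i \lfloor a_i/2^{32}\rfloor\ (\texttt{control}),\qquad R=\sum_i (a_i \bmod 2^{32}).\\
&S = 2^{32}C + R,\qquad 0 \le R \le n(2^{32}-1) < 2^{63},\qquad |C| \le n\,2^{31} < 2^{62}\ \text{(no overflow)}.\\
&C = 2^{32}c + x,\quad 0\le x<2^{32};\qquad R = 2^{32}\rho + t,\quad 0\le\rho<2^{31},\ 0\le t<2^{32}.\\
&L = \texttt{low}\in[-2^{63},2^{63}),\quad L\equiv S \pmod{2^{64}};\qquad y=\lfloor L/2^{32}\rfloor\in[-2^{31},2^{31}).\\
&S - L = 2^{64}c + 2^{32}(x+\rho-y),\qquad x+\rho-y\equiv 0 \pmod{2^{32}},\qquad -2^{31} < x+\rho-y < 2^{33}.\\
&\Rightarrow\quad x+\rho-y = 2^{32}k,\quad k\in\{0,1\},\qquad \text{required high} = c + k.\\
&k=0:\ x-y=-\rho\in(-2^{31},0];\qquad k=1:\ x-y = 2^{32}-\rho\in(2^{31},2^{32}].
\end{align*}
```

If this holds up, the test x - y > 1L << 31 is true exactly when k = 1, which is when the code does ++high.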

It's maybe 30% faster than the sane version (sum from the question).
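
Since my earlier tests missed the bug, here is a sketch of a check that mixes edge values with random ones and compares against a slow BigInteger reference. The names FastestSumCheck, referenceSum and countMismatches, the list of edge values, and the seed are all made up for illustration. A passing run is evidence, not a proof.

```java
import java.math.BigInteger;
import java.util.Random;

final class FastestSumCheck {
    // Slow reference: exact by construction, one BigInteger addition per element.
    static BigInteger referenceSum(long[] a) {
        BigInteger total = BigInteger.ZERO;
        for (final long x : a) total = total.add(BigInteger.valueOf(x));
        return total;
    }

    // The revised fastestSum from this answer, copied unchanged.
    static BigInteger fastestSum(long[] a) {
        long low = 0;
        long control = 0;
        for (final long x : a) {
            low += x;
            control += (x >> 32);
        }
        final long x = control & 0xFFFF_FFFFL;
        final long y = (low >> 32);
        long high = (control >> 32);
        if (x - y > 1L << 31) ++high;
        return BigInteger.valueOf(high).shiftLeft(64).add(BigInteger.valueOf(low));
    }

    // Counts inputs on which fastestSum disagrees with the reference.
    static int countMismatches(long seed, int rounds) {
        final long[] edges = {
            0, 1, -1, Long.MAX_VALUE, Long.MIN_VALUE,
            1L << 31, -(1L << 31), 1L << 32, -(1L << 32), (1L << 32) - 1
        };
        int mismatches = 0;
        // All ordered pairs of edge values; this includes {+1, -1}.
        for (final long p : edges) {
            for (final long q : edges) {
                final long[] a = {p, q};
                if (!fastestSum(a).equals(referenceSum(a))) ++mismatches;
            }
        }
        // Random arrays that mix full-range values with edge values.
        final Random random = new Random(seed);
        for (int i = 0; i < rounds; ++i) {
            final long[] a = new long[1 + random.nextInt(64)];
            for (int j = 0; j < a.length; ++j) {
                a[j] = random.nextBoolean()
                        ? random.nextLong()
                        : edges[random.nextInt(edges.length)];
            }
            if (!fastestSum(a).equals(referenceSum(a))) ++mismatches;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Any non-zero count would mean a counterexample exists.
        System.out.println("mismatches: " + countMismatches(42L, 100_000));
    }
}
```

Small values of mixed sign are the point of the edge list: uniformly random longs rarely produce inputs like {+1, -1}, which is the case that broke the first version.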