Java: 32-bit fp implementation of Math.sqrt()


The standard Math.sqrt() method already seems pretty fast in Java, but it has an inherent drawback: it always operates on 64-bit doubles, which can only cost speed when the data are 32-bit float values. Is it possible to do better with a custom method that takes a float parameter, performs only 32-bit operations, and returns a float?

I saw:

Fast sqrt in Java at the expense of accuracy

and it did little more than reinforce the notion that Math.sqrt() is generally hard-to-beat. I also saw:

http://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

which showed me a bunch of interesting C++/ASM hacks that I am simply too ignorant to port directly to Java. Though sqrt14 might be interesting as a part of a JNI call . . .

I also looked at Apache Commons FastMath, but it looks like that library defaults to the standard Math.sqrt() so no help there. And then there's Yeppp!:

http://www.yeppp.info/

but I haven't bothered with that yet.


There are 2 answers

Answer by apangin (best answer)

You don't need to do anything to speed up sqrt for 32-bit values: the HotSpot JVM does it for you automatically.

The JIT compiler is smart enough to recognize the f2d -> Math.sqrt() -> d2f pattern and replace it with the faster sqrtss CPU instruction instead of sqrtsd. The source.

The benchmark:

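import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class Sqrt {
    // Non-constant fields, so the JIT cannot fold the sqrt calls away
    double d = Math.random();
    float f = (float) d;

    // Plain 64-bit square root
    @Benchmark
    public double sqrtD() {
        return Math.sqrt(d);
    }

    // float -> double -> Math.sqrt() -> float: the pattern the JIT recognizes
    @Benchmark
    public float sqrtF() {
        return (float) Math.sqrt(f);
    }
}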

And the results:

Benchmark    Mode  Cnt       Score      Error   Units
Sqrt.sqrtD  thrpt    5  145501,072 ± 2211,666  ops/ms
Sqrt.sqrtF  thrpt    5  223657,110 ± 2268,735  ops/ms
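
In application code, the pattern from sqrtF() could be wrapped in a small helper so that call sites stay in float. The following is only a minimal sketch; the class name FloatMath and the method name are made up for illustration and do not come from any library. As far as I know, the round trip through double loses no accuracy here: a double has more than twice the precision of a float, so rounding the double result back to float should give the same value as a correctly rounded single-precision square root.

```java
/** Hypothetical helper class; the name is an assumption made for this sketch. */
final class FloatMath {
    private FloatMath() {}

    /**
     * Square root of a float, written as widen -> Math.sqrt -> narrow.
     * The implicit float-to-double widening plus the explicit (float) cast
     * form the f2d -> Math.sqrt() -> d2f shape described in the answer.
     */
    static float sqrt(float x) {
        return (float) Math.sqrt(x);
    }
}
```

For example, FloatMath.sqrt(2.25f) returns 1.5f, and a negative argument yields NaN, just as with Math.sqrt().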
Answer by Marcus Müller

Since you seem to know JNI, just write a minimal wrapper for double sqrt(double) and float sqrtf(float) from math.h in the C standard library and compare the performance.

Hint: you won't notice a difference unless you compute a lot of square roots, and at that point the advantage of using SIMD instructions to compute several square roots at once will most probably dominate everything else. You will need a memory-aligned array of the floating-point values from Java, which can be quite hard to get if you stick to the Java standard library.
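
One way the aligned array might be obtained with the standard library alone is a direct ByteBuffer that is over-allocated and then sliced at an aligned address. This is only a sketch under a few assumptions: Java 9 or later (for ByteBuffer.alignmentOffset), a 32-byte alignment chosen here as a typical SIMD requirement, and a made-up class name AlignedFloats. On the native side, a JNI function could get the raw pointer with GetDirectBufferAddress.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper; the name and the 32-byte alignment are assumptions. */
final class AlignedFloats {
    static final int ALIGNMENT = 32; // must be a power of two

    private AlignedFloats() {}

    /** Returns a direct buffer for floatCount floats whose first byte is ALIGNMENT-aligned. */
    static ByteBuffer allocate(int floatCount) {
        int bytes = floatCount * Float.BYTES;
        // Over-allocate so an aligned start address always exists inside the block.
        ByteBuffer raw = ByteBuffer.allocateDirect(bytes + ALIGNMENT);
        // Address of byte 0 modulo ALIGNMENT (Java 9+).
        int misalignment = raw.alignmentOffset(0, ALIGNMENT);
        int start = (ALIGNMENT - misalignment) % ALIGNMENT;
        raw.position(start);
        raw.limit(start + bytes);
        // slice() resets the byte order to big-endian, so set native order afterwards.
        return raw.slice().order(ByteOrder.nativeOrder());
    }
}
```

Calling AlignedFloats.allocate(n).asFloatBuffer() then gives a FloatBuffer view that Java code can fill with put() before handing the underlying buffer to native code.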