Does Duff's Device Speed up Java Code?


Using the stock Sun 1.6 compiler and JRE/JIT, is it a good idea to unroll a loop by hand as extensively as Duff's Device does? Or does that just obfuscate the code with no performance benefit?

The Java profiling tools I've used are less informative about line-by-line CPU usage than, say, valgrind, so I was looking to augment measurement with other people's experience.

Note that, of course, you can't exactly code Duff's Device, but you can do the basic unroll, and that's what I'm wondering about.

        short stateType = data.getShort(ptr);
        switch (stateType) {

        case SEARCH_TYPE_DISPATCH + 16:
            if (c > data.getChar(ptr + (3 << 16) - 4)) {
                ptr += 3 << 16;
            }
        case SEARCH_TYPE_DISPATCH + 15:
            if (c > data.getChar(ptr + (3 << 15) - 4)) {
                ptr += 3 << 15;
            }
         ...

down through many other values.

There are 2 answers

Matthew Flaschen (best answer)

It doesn't much matter whether Duff's Device is a good idea in Java (it's not), because it won't compile.

EDIT: This is mentioned explicitly in the JLS:

A trick known as Duff's device can be used in C or C++ to unroll the loop, but this is not valid code in the Java programming language:

Or, more bluntly (from the same section):

Great C hack, Tom, but it's not valid here.

EDIT: To answer your more (too) general question, usually no. You should generally rely on the JIT.
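For illustration, here is a minimal sketch of the kind of "basic unroll" that Java does accept: a fall-through switch that handles the leftover elements, followed by a loop that processes four elements per pass. The class and method names are made up for this example and are not from the question. It shows only that the two forms compute the same result. It makes no claim about which one is faster, which you would have to measure on your own JVM.

```java
// Hypothetical illustration; the class and method names are not from the question.
final class UnrollSketch {

    // Straightforward loop: the form the JIT is designed to optimize.
    static long sumLoop(int[] a) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Manual 4x unroll. The switch consumes the leftover elements by
    // falling through its cases, then the loop handles four per pass.
    static long sumUnrolled(int[] a) {
        long sum = 0;
        int i = 0;
        switch (a.length % 4) {
            case 3: sum += a[i++]; // fall through
            case 2: sum += a[i++]; // fall through
            case 1: sum += a[i++]; // fall through
            default: break;
        }
        for (; i < a.length; i += 4) {
            sum += a[i];
            sum += a[i + 1];
            sum += a[i + 2];
            sum += a[i + 3];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7};
        // Both methods should agree on every input.
        System.out.println(sumLoop(data) == sumUnrolled(data));
    }
}
```

Unlike true Duff's Device, the switch here sits in front of the loop rather than jumping into its body, which is what keeps it legal Java.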

polygenelubricants

You are ignoring the fact that Java compiles to bytecode for a stack-oriented virtual machine. Whatever low-level optimization trick you attempt at the Java source level is largely ineffective. The real optimization happens when the JIT compiler produces machine code for the target architecture, a process that you cannot control and, for the most part, need not care about.

You should instead optimize at a much higher level, and let the JIT compiler handle the low-level optimizations.