Using the stock Sun 1.6 compiler and JRE/JIT, is it a good idea to unroll a loop as extensively as Duff's Device does? Or does it end up as code obfuscation with no performance benefit?
The Java profiling tools I've used are less informative about line-by-line CPU usage than, say, valgrind, so I was looking to augment measurement with other people's experience.
Note that, of course, you can't exactly code Duff's Device, but you can do the basic unroll, and that's what I'm wondering about.
    short stateType = data.getShort(ptr);
    switch (stateType) {
        case SEARCH_TYPE_DISPATCH + 16:
            if (c > data.getChar(ptr + (3 << 16) - 4)) {
                ptr += 3 << 16;
            }
        case SEARCH_TYPE_DISPATCH + 15:
            if (c > data.getChar(ptr + (3 << 15) - 4)) {
                ptr += 3 << 15;
            }
        ...
down through many other values.
It doesn't much matter whether it's a good idea (it's not), because it won't compile.
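To make the restriction concrete: a true Duff's Device puts case labels inside the body of a loop, and Java rejects that because every case label has to sit directly in the switch block. The closest legal form I can think of is a fall-through switch for the leftover elements, followed by a separately unrolled loop. Here is a rough sketch of that shape. The class name, method names, and the summing task are made up for illustration and are not taken from the question's code.

```java
final class UnrolledSum {
    // Plain loop: the baseline that the JIT is free to unroll by itself.
    static long sumSimple(int[] a) {
        long total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // Hand-unrolled by 4. The switch consumes the a.length % 4 leftover
    // elements first, using deliberate fall-through. Every case label sits
    // directly in the switch block, which is what Java requires.
    static long sumUnrolled(int[] a) {
        long total = 0;
        int i = 0;
        switch (a.length % 4) {
            case 3: total += a[i++]; // falls through
            case 2: total += a[i++]; // falls through
            case 1: total += a[i++]; // falls through
            default: break;
        }
        // The number of remaining elements is now a multiple of 4.
        for (; i < a.length; i += 4) {
            total += a[i];
            total += a[i + 1];
            total += a[i + 2];
            total += a[i + 3];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7};
        System.out.println(sumSimple(data) + " " + sumUnrolled(data)); // prints "28 28"
    }
}
```

Both methods return the same result. Whether the unrolled one is any faster is something only a measurement on your own JVM can tell you.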
EDIT: This is mentioned explicitly in the JLS:
Or, more bluntly (from the same section):
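EDIT: To answer your more (too) general question, usually no. Manual unrolling rarely pays for the loss in readability, and you should generally rely on the JIT, which can unroll hot loops on its own.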
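For the code in the question, relying on the JIT would mean writing the ladder of cases as an ordinary loop over the shift amount. This is only a sketch under assumptions: I'm guessing that `data` is a `java.nio.ByteBuffer` and that every case has the same shape as the two shown. `topShift` and `lowShift` are hypothetical parameters standing in for the range of cases the real switch covers.

```java
import java.nio.ByteBuffer;

final class SearchLoop {
    // Loop form of the question's ladder: one step per shift value, largest
    // first. topShift and lowShift are hypothetical; they stand in for the
    // first and last case of the original switch.
    static int advance(ByteBuffer data, int ptr, char c, int topShift, int lowShift) {
        for (int shift = topShift; shift >= lowShift; shift--) {
            int step = 3 << shift;
            if (c > data.getChar(ptr + step - 4)) {
                ptr += step;
            }
        }
        return ptr;
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.allocate(64); // zero-filled, so 'a' compares greater everywhere
        System.out.println(advance(data, 0, 'a', 3, 1)); // prints 42 (24 + 12 + 6)
    }
}
```

There is no guarantee that HotSpot unrolls this particular loop, so profile both versions before settling on the hand-written switch.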