I read the question Why is it faster to process a sorted array than an unsorted array? today and trying to replicate the result in Java.
Here's my code:
public class BranchPredictorTest {
static int[] getBranchCounts1(List<Integer> list) {
long t0 = System.currentTimeMillis();
int[] counts = new int[3];
for(Integer v : list)
if(v < 1000)
counts[0]++;
else if(v < 2000)
counts[1]++;
else
counts[2]++;
System.out.println("[time cost]\t" + (System.currentTimeMillis() - t0) + " ms");
return counts;
}
static int[] getBranchCounts2(List<Integer> list) {
long t0 = System.currentTimeMillis();
int[] counts = new int[3];
for(Integer v : list)
counts[v/1000]++;
System.out.println("[time cost]\t" + (System.currentTimeMillis() - t0) + " ms");
return counts;
}
static void test(List<Integer> list) {
int[] bc1 = getBranchCounts1(list);
System.out.println(String.format("%d, %d, %d", bc1[0], bc1[1], bc1[2]));
int[] bc2 = getBranchCounts2(list);
System.out.println(String.format("%d, %d, %d", bc2[0], bc2[1], bc2[2]));
}
public static void main(String[] args) {
Random rand = new Random();
List<Integer> list = new ArrayList<Integer>();
for(int i=0; i<2e7; i++) {
list.add(rand.nextInt(3000));
}
test(list);
Collections.sort(list);
test(list);
}
}
You can see the 1st function is implemented with branches and the 2nd without branches. I ran both functions for sorted and unsorted array. Based on what I read for Branch Prediction, I expected the time cost to be:
sorted branch < sorted un-branch < unsorted un-branch < unsorted branch
But the result is a bit strange:
[time cost] 145 ms
6670864, 6671007, 6658129
[time cost] 65 ms
6670864, 6671007, 6658129
[time cost] 332 ms
6670864, 6671007, 6658129
[time cost] 1852 ms
6670864, 6671007, 6658129
The sorted branching time cost is larger than the unsorted ones, and after sorted the arraylist, the time cost of un-branch counting increased its runtime dramatically...
EDIT: I tried the same logic in C++, and the result basically meets the expectation:
[time cost] 248 ms
[time cost] 142 ms
[time cost] 87 ms
[time cost] 143 ms
I hope anyone can give me an explanation of how that happens, thanks for viewing/answering my question.
I would guess that the (odd) results you are seeing are due to the fact that you are running them without warming up the JVM.
Try running the above in a loop of say 1,000+ cycles and look at the 95 percentile or so.