In the question *Why is it faster to process a sorted array than an unsorted array?*, the answer had to do with branch prediction being poor with randomly ordered data and good with sorted data. But the thing is, the data is the same; only the ordering is different.
That confused me: I had pretty much assumed the prediction was compiled in, either "always assume taken" or "always assume not taken", but apparently it's dynamic. Since it's dynamic, how does it know which way to predict? Is the data for it kept in a cache somewhere? I've heard of the instruction cache and of RAM being cached; is there prediction data in the CPU cache? Is it only in L1?
How exactly does branch prediction adjust its prediction, and where is the data kept that lets it make that dynamic decision?
Part of what you are asking about is branch target prediction, but that doesn't come into play in this particular example: the target of each jump is fixed in the instruction, and the only question is whether the branch is taken or not.
The code in this case boils down to the following (a sketch of the test from the linked question, using the names from that post):
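```cpp
// The hot test from the linked question: only the comparison
// matters here, the rest of the loop body is incidental.
if (data[c] >= 128)
    sum += data[c];
```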
That test sits in a loop, so in assembly it is going to look something like this (hand-written x86-style pseudo-assembly; real compiler output will differ):
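```asm
loop_start:
    mov  eax, [data + rcx*4]   ; load data[c]
    cmp  eax, 128              ; is data[c] >= 128 ?
    jl   skip_add              ; data-dependent branch: skip the add if not
    add  ebx, eax              ; sum += data[c]
skip_add:
    inc  rcx                   ; ++c
    cmp  rcx, arraySize
    jl   loop_start            ; loop branch ("jump less than"): back to the top
```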
At the jump-less-than at the bottom of the loop, the branch predictor has to guess whether to go back to the start of the loop or to fall through. If it guesses right, the code runs faster, because the instruction queue stays well fed, without bubbles or stalls in the pipeline. The same goes for the data-dependent jump around the `sum += data[c]` line, and that is the branch whose behaviour changes when you sort the array.
Following the weather forecast principle (tomorrow's weather will be much the same as today's), if the CPU can remember which way a branch went last time, it will be faster. You can think of this as a single Boolean value, stored in the implementation of the CPU, which records whether or not the branch was taken the last time; the predictor guesses the same again, and after each jump is taken (or not), the value is updated for next time.
If you have a pattern of (e.g.)

`aaaaAAAA`

and you're branching on whether each character is a capital or not, then with this simple prediction you'll be right all of the time except at the `aA` transition. If the data is `aAaAaAaA`, then you will be wrong all the time. So the pattern of the data makes the taken/not-taken record either close to what actually happens or close to its opposite. In this case, if you sort the data, then all the small values will be at the start, so the branch goes the same way for long runs and the branch predictor will have good success in predicting correctly. If you have random data, then the branch predictor will be much less useful.
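To make the bookkeeping concrete, here is a toy simulation of that single remembered bit, run on the two patterns above (a model of the description, not of real hardware):

```cpp
#include <cstdio>
#include <string>

// One-bit predictor: remember the last outcome, predict the same again.
// Returns how many of the branch outcomes it guessed correctly.
static int score(const std::string& pattern) {
    bool lastTaken = false;                    // the single stored bit
    int correct = 0;
    for (char ch : pattern) {
        bool taken = (ch >= 'A' && ch <= 'Z'); // the branch: "is it a capital?"
        if (taken == lastTaken) ++correct;     // prediction = last outcome
        lastTaken = taken;                     // update the bit for next time
    }
    return correct;
}

int main() {
    std::printf("aaaaAAAA: %d/8 correct\n", score("aaaaAAAA")); // prints 7/8
    std::printf("aAaAaAaA: %d/8 correct\n", score("aAaAaAaA")); // prints 1/8
}
```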
Modern branch predictors are much more complicated, so this is a simplistic view. But if you think of the branch-prediction state as a single taken/not-taken Boolean held in a memory cell that is part of the branch predictor's own logic (not in the L1 cache or in RAM), it should make sense.
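As one taste of "more complicated": a classic refinement is the two-bit saturating counter, which has to be wrong twice in a row before it flips its prediction, so the single odd iteration at the end of a loop doesn't spoil the prediction for the next run through. A sketch of that scheme (again just a model, not real hardware):

```cpp
#include <cstdio>

// Two-bit saturating counter with states 0..3:
// 0 and 1 predict "not taken"; 2 and 3 predict "taken".
struct TwoBitPredictor {
    int state = 1;  // start weakly not-taken (an arbitrary choice)

    bool predict() const { return state >= 2; }

    void update(bool taken) {
        if (taken) { if (state < 3) ++state; }  // saturate at 3
        else       { if (state > 0) --state; }  // saturate at 0
    }
};

int main() {
    TwoBitPredictor p;
    // A loop branch: taken 7 times, then falls through once, repeated.
    int correct = 0, total = 0;
    for (int rep = 0; rep < 4; ++rep) {
        for (int i = 0; i < 8; ++i) {
            bool taken = (i != 7);              // not taken on the last pass
            if (p.predict() == taken) ++correct;
            p.update(taken);
            ++total;
        }
    }
    // A one-bit predictor mispredicts twice per loop (at the exit and at
    // the re-entry); the two-bit counter mispredicts only at the exit.
    std::printf("%d/%d correct\n", correct, total); // prints 27/32
}
```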