Branch prediction and UB (undefined behavior)

327 views Asked by At

I know a little something about branch prediction. This happens at the CPU and has nothing to do with compilation. Although you might be able to tell the compiler if one branch is more likely than the other e.g. in C++20 via [[likely]] and [[unlikely]] (see cppreference) this is seperate to the branch prediction the CPU performs (see Can I improve branch prediction with my code?).

As far as I know when I have e.g. a loop (with an exit condition) the CPU will predict that the exit condition will not be met and try to perform some operations inside the loop even if the condition has not been checked yet. If the CPU predicts correct it saves some time and everything is fine. However what happens if it is not able to predict it correct? I know that this will be a performance hit but I don't know if some already done operations are discareded or reversed or just how it handles it.

Now I have come up with two simple examples. The first one (if we ignore that the compiler might just calculate the sum at compile time and I asume that no optimisations occure) should be very easy to predict for the CPU. The loop condition will be the same for the whole time and the condition in the loop only switches once. This means that the prediction will get us a nice performance boost and even if it fails a few times, adding a number can easily be reversed.

In the second example the exit condition is again easy to predict. In the loops body I allocate a new int array via malloc. Note that I don't free it on purpose since I want the allocation to succeed for a long time so the CPU predicts this success. At some time the allocation will fail when I run out of memory (I didn't calculate the total memory consumption and suppose that memory will not be moved to disk) or some other error occurs. This means that ptr will be NULL and dereferencing it is UB. It is not defined what happens, it could just be a no-op, crash my program or cause my PC to fly away. Therefore I conclude that the CPU can not just undo that and I am wondering what happens.

#include <stdlib.h>

#define VERSION 1

#if VERSION == 1
int main() {
    size_t sum = 0ull;

    for (size_t i = 0ull, max = 1'000ull; i < max;  ++i) {
        if (i < (max / 2)) {
            sum += 2 * i;
        }
        else {
            sum += i;
        }
    }

    return 0;
}

#else
int main() {
    int* ptr = NULL;

    for (size_t i = 0ull, max = 1'000'000ull; i < max; ++i) {
        ptr = (int*)malloc((sizeof * ptr) * 1'000ull);

        if (ptr) {
            *ptr = 1234;
        }

        // free(ptr)
    }

    return 0;
}
#endif

Branch prediction is the CPUs task and UB obviously exists in C as well as in C++ so I think an answer to this doesn't require one specific language and my code should work in both languages. If the chosen language however makes a difference I am more interested in C++ than in C but would be happy for any answers.

4

There are 4 answers

0
Erik Eidt On BEST ANSWER

However what happens if it is not able to predict it correct?

It wastes time & energy doing work that has to be thrown away.

This means that ptr will be NULL and dereferencing it is UB.

No, the language doesn't work that way.  The compiler must honor the guard (if-statement) around that dereference.

The compiler must honor the C++ language, full stop!  If the compiler generates a speculative load of the null pointer (possible on some ISAs like Itanium), that must be conditional and ignorable, because the program explicitly said not to.

Meanwhile, the hardware must honor the ISA, period!  If the hardware generates a speculative load of the null pointer, that also must be conditional and ignorable, because the machine code program explicitly said not to.

via [[likely]] and [[unlikely]] … this is seperate to the branch prediction the CPU performs

Code path hints in the C++ language do not necessarily translate into branch prediction hints in machine code.

Many ISAs do not have (or their implementations do not use) machine code branch direction hints.  This is because hardware branch prediction has gotten so good that it is done very early and doesn't need the hint.  In order to use the hint, the instruction has to decoded, which happens later than we'd like in processor stages for prediction.

What the C++ compiler can do with those hints though is rearrange the machine code so that the likely path is straight and contiguous and the unlikely paths are relocated elsewhere, out of the way of the straight path.

4
Max Meijer On

The idea of speculative execution is that it is hidden from you as a programmer. If you want to get insipration to possible ways this might be implemented you could for example look at how they do speculative execution in BOOM.

The C++ action of accessing a nullpointer will probably map to something like trying to access memory at an invalid address in the machine code. If this were to happen a TRAP would occur, but if it happens speculatively I suspect speculative execution would wait for the branch to be confirmed before issuing the trap.

The Boom docs say the following about misspeculation:

If a branch (or jump) is misspeculated, the Branch Unit must redirect the PC to the correct target, kill the Front-end and Fetch Buffer, and broadcast the misspeculated branch tag so that all dependent, inflight UOPs<Micro-Op (UOP) may be killed. The PC redirect signal goes out immediately, to decrease the misprediction penalty. However, the kill signal is delayed a cycle for critical path reasons.

5
0___________ On

Branch prediction does not have anything to do with UB.

UB is C or C++ language concept abstracted from the actual implementation. It can be only analyzed on the source code level. If there is a UB in your code then the compiler is basically free to do what it wants as the standard does not specify what should happen in this case

If your source code does not invoke UB, then the compiler has to emit the code which (when executed) will have the same observable behaviour on all platforms.

in C++20 via [[likely]] and [[unlikely]] (see preference) this is seperate to the branch prediction the CPU performs

It existed long before C++20 as compiler extensions (for example GCC __builtin_expect) and is only a hint to the compiler to help it better understand your program flow. In "normal" programming it is a very rarely used feature and you should use it only in very specific cases when it can significantly improve the performance (for example writing the low-level parts of OS kernel or fast device drivers)

I would rather suggest focusing on the language itself (understanding the concepts) rather than esoteric implementation details.

0
vinc17 On

Undefined behavior is a concept of programming language only. The CPU needs to execute the program as written in assembly code (e.g. generated by a compiler). However, the full definition of the expected behavior is not clear at all. For instance, due to speculative execution and branch misprediction, the CPU may do things that are not expressed by the assembly code, which are not visible in the results, but whose effects are observable via timings. This is what led to vulnerabilities such as Spectre.