Use of __builtin_expect for bounds check


I have this function which, given a Gray code, returns the next Gray code (you can find a more complete explanation of how it works here). The thing is that I wanted to make this increment function modular, so that incrementing the Gray code corresponding to UINT_MAX returns the Gray code corresponding to 0 (respectively, the most significant bit alone and 0). Since that is not the default behaviour, I added a check for this special case. Here is the complete algorithm:

#include <limits.h>

unsigned next_gray(unsigned gray)
{
    // CHAR_BIT * sizeof(unsigned) is the width of unsigned in bits,
    // so this is the most significant bit of the type
    static const unsigned msb
        = 1u << (CHAR_BIT * sizeof(unsigned) - 1u);

    // gray is odd
    if (__builtin_parity(gray))
    {
        if (__builtin_expect(gray == msb, 0))
        {
            return 0u;
        }
        else
        {
            unsigned y = gray & -gray;  // isolate the lowest set bit
            return gray ^ (y << 1u);    // flip the bit to its left
        }
    }

    // gray is even
    return gray ^ 1;
}
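To illustrate the intended wrap-around, here is a minimal sanity check (a sketch; gray_encode is a hypothetical helper using the standard n ^ (n >> 1) encoding, and next_gray is the function above):

#include <assert.h>
#include <limits.h>

static unsigned gray_encode(unsigned n)
{
    return n ^ (n >> 1);
}

int main(void)
{
    // Ordinary increments: next_gray(gray(n)) == gray(n + 1).
    for (unsigned n = 0u; n < 1000u; ++n)
    {
        assert(next_gray(gray_encode(n)) == gray_encode(n + 1u));
    }

    // The special case: the Gray code of UINT_MAX is the most
    // significant bit alone, and incrementing it wraps around to 0.
    assert(gray_encode(UINT_MAX)
        == 1u << (CHAR_BIT * sizeof(unsigned) - 1u));
    assert(next_gray(gray_encode(UINT_MAX)) == 0u);

    return 0;
}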

So, the actual question is about branch prediction. I have often read that __builtin_expect should be used only when a branch is really likely or really unlikely to be taken, the common example being to speed up the error-free path of a program.
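This is the same pattern as the likely/unlikely wrapper macros commonly seen in C code bases (the Linux kernel being the best-known example); a sketch:

// Conventional wrappers around __builtin_expect; the !! normalizes
// any non-zero value to 1 so it matches the expected value exactly.
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

// With these, the special case above would read:
//     if (unlikely(gray == msb))
//     {
//         return 0u;
//     }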

Considering that I am not handling an error case, I am unsure whether using __builtin_expect for a bounds check like this is a good idea at all. Is this a good place to use __builtin_expect, or is incrementing the maximum value a common enough operation to deceive branch prediction?

Note: as always, comments and answers highlight things that are not clear in my question :)

I will give a bit more context: this function is meant to be part of a library, developed for the sake of being a library and not used by any actual project as of now. Therefore, adding __builtin_expect implies that I expect people to mostly increment values other than the maximum value; without any actual project at hand, I wish to know whether this is a safe assumption.


1 Answer

Answered by missimer:

Taken from the GCC online docs:

You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.

Is this a good place to use __builtin_expect or is incrementing the max value a common enough operation to deceive branch prediction?

This all depends on your application. If the value of gray is uniformly distributed, then the special case will be hit only 1 time out of (UINT_MAX + 1), but can you say that for sure? This is why the docs recommend using -fprofile-arcs.

The gcov Wikipedia article actually contains a nice, simple example of how to use -fprofile-arcs and gcov to gather the information needed to make an informed decision.
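For example, a throwaway harness along these lines (a sketch; the file names and the workload are hypothetical) can be compiled with -fprofile-arcs -ftest-coverage, run on representative inputs, and then inspected with gcov to see how often each branch was actually taken:

/* next_gray_prof.c -- hypothetical profiling harness */
unsigned next_gray(unsigned gray);  /* defined in next_gray.c */

int main(void)
{
    unsigned g = 0u;

    /* Replace this loop with whatever input distribution is
       representative of the real workload. */
    for (unsigned long i = 0ul; i < 100000000ul; ++i)
    {
        g = next_gray(g);
    }

    return (int)(g & 0xffu);  /* use g so the loop isn't optimized away */
}

/* Build, run, and inspect:
 *   gcc -fprofile-arcs -ftest-coverage next_gray.c next_gray_prof.c
 *   ./a.out
 *   gcov next_gray.c     # writes next_gray.c.gcov with per-line counts
 */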

Update:

If you can't profile, then, all things being equal, the edge case gray == msb is very unlikely, so you are probably safe with your use of __builtin_expect. However, if you can't profile because you don't know how your library is going to be used, this sounds more like pessimization than optimization. If I use your library and always pass in a gray that equals msb, then your library won't be as fast for me. Libraries that aren't written with a specific application in mind generally try to perform well in the average case, without making assumptions about their input. This is why you see different implementations of malloc such as jemalloc and tcmalloc: both are optimized for very specific use cases, and if you use one in a way it wasn't optimized for, it won't do as well. Also, this blog article might be of some interest to you.
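If you do want to keep the hint in a general-purpose library, one option is to make it opt-out, so that users who know their workload can turn it off; a sketch with a hypothetical GRAY_UNLIKELY macro:

/* Hypothetical: users who expect to hit the edge case often can
   define GRAY_NO_BRANCH_HINTS before including the header. */
#if defined(GRAY_NO_BRANCH_HINTS) || !defined(__GNUC__)
#  define GRAY_UNLIKELY(x) (x)
#else
#  define GRAY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#endif

/* In next_gray, the check would then read:
 *     if (GRAY_UNLIKELY(gray == msb))
 *     {
 *         return 0u;
 *     }
 */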