Bug casting from bool* to void* to int*


Often in C++, one has a parameter void* user_data that one can use to pass an arbitrary type.

I used this to pass an array of booleans. However, I had a bug where I cast from bool* --> void* --> int* and I got weird results. Here is an example.

#include <iostream>

int main() {
    bool test[2] = { };
    void *ptr = static_cast<void*>(test);
    std::cout << static_cast<bool*>(ptr)[0] << '\n';
    std::cout << static_cast<int*>(ptr)[0] << '\n';
    std::cout << static_cast<int>(test[0]) << '\n';
}

Output:

$ g++ int_bool.cpp 
$ ./a.out 
0
-620756992
0

Can someone explain to me what the problem is? Normally when I cast from bool to int, there is no problem: false maps to 0 and true maps to 1. Clearly, that's not the case here.

Best answer, by phuclv:

static_cast<int*>(ptr)[0] casts ptr to int* and reads the first int through it. The original array is only 2 bytes long (assuming sizeof(bool) is 1), so reading a 4-byte int goes past its end, which is undefined behavior unless int is a 2-byte type on your system. You're also violating the strict aliasing rule by accessing bool objects through an lvalue of type int, which is UB as well. On top of that, the bool array may not be suitably aligned for int. On x86 a misaligned read usually goes unnoticed because the hardware allows unaligned access by default, but on architectures that require aligned access it can fault (for example with a bus error).

static_cast<int>(test[0]) OTOH converts test[0] (which is a bool) to int and is a completely valid value conversion.


Update:

"The type int* refers to a pointer whose object is 4-bytes long, whereas bool* refers to a pointer whose object is 2-bytes long"

No. When you dereference a pointer p, sizeof(*p) bytes are read from memory starting at the address it holds, and they are treated as a value of the pointed-to type. So *bool_ptr reads 1 byte and *int_ptr reads 4 bytes (if bool and int are 1-byte and 4-byte types respectively).

In your case the bool array contains 2 bytes, so when 4 bytes are read through static_cast<int*>(ptr), 2 bytes inside the array and 2 bytes outside it are read. If you declared bool test[4] = {}; (or more elements), the int* read would stay within memory that belongs to you and would likely print 0, but it is still formally UB because of the aliasing and alignment issues.

Now try changing the bool values to nonzero and see what gets printed:

bool test[4] = { true, false, true, false };

You'll quickly realize that casting a pointer to a different pointer type doesn't read the value in the old type and convert it to the new type, the way a simple value conversion (i.e. a cast of the value itself) does. Instead it treats the same memory differently. This is essentially a reinterpret_cast, which you can read up on to understand more about this problem.

"I don't understand what you are saying about char*. You're saying casting from any type to char* is valid?"

Casting from any other object pointer type to char* and reading the bytes through it is valid. Read the question about the strict aliasing rule above:

You can use char* for aliasing instead of your system's word. The rules allow an exception for char* (including signed char and unsigned char). It's always assumed that char* aliases other types.

It's used for things like memcpy where you copy the bytes representing a type to a different destination

#include <cstring>
bool test[4] = { true, true, true, true };
int v;
// memcpy takes (destination, source, size): copy the bytes of test into v
memcpy((char*)&v, (char*)test, sizeof v);

Technically memcpy takes void* parameters; the casts to char* are just for demonstration.
