How to cast char array to int at non-aligned position?

Is there a way in C/C++ to cast a char array to an int at any position?

I tried the following, but the access is automatically aligned to the nearest 32-bit boundary (on a 32-bit architecture) when I use pointer arithmetic with non-constant offsets:

unsigned char data[8];
data[0] = 0; data[1] = 1; ... data[7] = 7;
int32_t p = 3;
int32_t d1 = *((int*)(data+3));  // = 0x03040506  CORRECT
int32_t d2 = *((int*)(data+p));  // = 0x00010203  WRONG

Update:

  • As stated in the comments the input comes in tuples of 3 and I cannot change that.
  • I want to convert 3 values to an int for further processing and this conversion should be as fast as possible.
  • The solution does not have to be cross-platform. I am working with a very specific compiler and processor, so it can be assumed that the target is a 32-bit big-endian architecture.
  • The lowest byte of the result does not matter to me (see above).

My main questions at the moment are: Why does d1 have the correct value while d2 does not? Is this also true for other compilers? Can this behavior be changed?

There are 2 answers

Bathsheba On BEST ANSWER

No, you can't do that in a portable way.

The behaviour of casting a char* to an int* and reading through the result is undefined in both C and C++ when the address is not suitably aligned for int (possibly for the very reasons that you've spotted: ints may be required to sit on 4-byte boundaries, whereas the elements of data are contiguous bytes).

(The fact that data+3 works but data+p doesn't is possibly due to compile-time vs. run-time evaluation.)

Also note that the signed-ness of char is not specified in either C or C++ so you should use signed char or unsigned char if you're writing code like this.

Your best bet is to use the bitwise shift operators (>> and <<) together with bitwise | and & to combine the char values into an int. Also consider using int32_t in case you build for targets with 16- or 64-bit ints.
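As a rough sketch of the shift-and-OR approach for the case in the question (three input bytes, most significant first, lowest byte of the result unused): the helper name pack3_be and its parameters are made up for illustration, and uint32_t is used instead of int32_t so that shifting a byte into the top bits cannot overflow a signed type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the question): combine the three bytes
   buf[pos], buf[pos+1], buf[pos+2] into the upper 24 bits of a 32-bit
   value, most significant byte first. The lowest byte is left as zero.
   No pointer cast is involved, so alignment of buf+pos does not matter. */
static uint32_t pack3_be(const unsigned char *buf, size_t pos)
{
    return ((uint32_t)buf[pos]     << 24) |
           ((uint32_t)buf[pos + 1] << 16) |
           ((uint32_t)buf[pos + 2] <<  8);
}
```

With data[i] = i as in the question, pack3_be(data, 3) gives 0x03040500 regardless of the byte order of the machine running it.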

mafso On

There is no way: converting a pointer to a pointer type for which it is not correctly aligned is undefined.

You can use memcpy to copy the char array into an int32_t.

int32_t d = 0;
memcpy(&d, data+3, sizeof d); // copies the 4 bytes data[3..6]; data+3 need not be aligned

Most compilers have built-in functions for memcpy with a constant size argument, so it's likely that this won't produce any runtime overhead.
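A minimal sketch of how the memcpy approach could be wrapped for a run-time offset (load_u32 is a hypothetical name, and the caller is assumed to guarantee that four bytes are readable at buf + pos):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes starting at buf[pos] into a uint32_t.
   memcpy has no alignment requirement on its source, so pos may be any
   offset. The bytes are copied as they are, so the numeric value of the
   result depends on the byte order of the machine. */
static uint32_t load_u32(const unsigned char *buf, size_t pos)
{
    uint32_t v;
    memcpy(&v, buf + pos, sizeof v);
    return v;
}
```

On a big-endian target such as the one described in the question, load_u32(data, 3) would yield 0x03040506 for data[i] = i; on a little-endian machine the same call yields 0x06050403.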

Even though a cast like you've shown is allowed for correctly aligned pointers, dereferencing such a pointer is a violation of strict aliasing. An object with an effective type of char[] must not be accessed through an lvalue of type int.

In general, type-punning is endianness-dependent, and converting a char array representing RGB colours is probably easier to do in an endianness-agnostic way, something like

int32_t d = (int32_t)data[2] << 16 | (int32_t)data[1] << 8 | data[0];