Does char[] + memcpy() violate strict aliasing?


I'm using a char array in a struct to hold some generic data, like this (the input type may be a struct of unknown size so I can't just use a union; this code is heavily simplified):

#include <string.h>

typedef struct {
    char buf[256];
} data;

void data_set_int(data *d, int a) {
    memcpy(d->buf, &a, sizeof(a));
}

int data_get_int(data *d) {
    int ret;
    memcpy(&ret, d->buf, sizeof(ret));
    return ret;
}

void data_set_float(data *d, float a) {
    memcpy(d->buf, &a, sizeof(a));
}

float data_get_float(data *d) {
    float ret;
    memcpy(&ret, d->buf, sizeof(ret));
    return ret;
}

int main(void) {
    data d;

    data_set_int(&d, 3);
    int int_result = data_get_int(&d);

    data_set_float(&d, 10.0f);
    float float_result = data_get_float(&d);

    return 0;
}

If I never attempt to write a float and then read the data as an int or vice versa, is this code well-defined in C(99)?

Compiling with GCC yields no warnings, and running the code gives the expected behavior (int_result == 3, float_result == 10.0f). Changing the memcpy to a normal pointer dereference (int ret = *(int *)d->buf) also works fine with no warnings.

All of the sources I've read on strict aliasing say that you can read any type as a char * (so I think that means set should be fine), but you cannot read a char * as any other type (not so sure that get is fine). Have I misunderstood the rule?

There is 1 answer

supercat (best answer)

Under C89, the behavior of memcpy was analogous to reading each byte of the source using an unsigned char*, and writing each byte of the destination using an unsigned char*; since character pointers may be used to access anything else, that made memcpy universal for purposes of data conversion.
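The byte-by-byte model described above can be sketched roughly as follows. This is only an illustration: bytewise_copy is a hypothetical name, not a library function, and a real memcpy is not required to be implemented this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any library): copies n bytes from src
   to dst by reading and writing each byte through unsigned char pointers,
   which is the model the answer attributes to C89's memcpy. */
void *bytewise_copy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(d != NULL && s != NULL);

    for (size_t i = 0; i < n; i++) {
        d[i] = s[i];   /* character-type access, allowed for any object */
    }
    return dst;
}
```

Under these assumptions, bytewise_copy(&ret, d->buf, sizeof(ret)) would play the same role as the memcpy call in data_get_int.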

C99 added some new restrictions to memcpy. It can still be used when the destination object has a declared type, or when every non-character pointer that will later be used to read the destination has a type consistent with the effective type of the source. When the destination has no declared type, however, the copy leaves it in a state that is only readable using the source type. I don't think C11 has eased those restrictions in any meaningful way.

Your code should not be affected by the memcpy rules, since each memcpy operation either writes to an object with a declared type or writes to storage that will only be read via memcpy into an object with a declared type. The main problem with C99's memcpy rules arises when code needs to update objects in place without knowing the type with which they will next be read.
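Since the question mentions that the input may be a struct of unknown size, here is a minimal sketch of how the same pattern could be generalized. The names data_set, data_get, and struct point are hypothetical and not from the original code; the size check and return codes are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char buf[256];
} data;

/* Hypothetical example payload, standing in for a struct of unknown size. */
struct point {
    int x;
    float y;
};

/* Hypothetical generic setter: copies n bytes of any object into the
   buffer. Returns 0 on success, -1 if the object does not fit. */
int data_set(data *d, const void *src, size_t n) {
    assert(d != NULL && src != NULL);
    if (n > sizeof d->buf) {
        return -1;
    }
    memcpy(d->buf, src, n);   /* destination has declared type char[256] */
    return 0;
}

/* Hypothetical generic getter: copies n bytes out of the buffer into an
   object that the caller declared with its real type. */
int data_get(const data *d, void *dst, size_t n) {
    assert(d != NULL && dst != NULL);
    if (n > sizeof d->buf) {
        return -1;
    }
    memcpy(dst, d->buf, n);   /* destination has a declared type */
    return 0;
}
```

A caller would declare a struct point, pass its address and sizeof to data_set, and later read it back with data_get into another declared struct point, so the buffer itself is never accessed through a non-character pointer.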

For example, on a system where both int and long have identical 32-bit representations, it should be possible to write a function that can load data into either an int[] or a long[] without having to know which kind of pointer it is receiving (the sequence of machine operations would be the same in either case). Suppose code reads some data into a temporary int[] and then uses memcpy to move it to the final destination. The Standard guarantees that this sequence works if the destination is an actual declared object of type int[] or long[], or if it is a region of allocated storage that will be read as int[]. It does not guarantee that it works if the destination is a region of allocated storage that will next be read as long[].
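A rough sketch of such a loader is shown below. The name load_values, the 16-element staging array, and the generated values are all hypothetical; passing a long[] is only meaningful on a platform where int and long share the same representation, as assumed in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader: stages count values in a temporary int array and
   then moves them to dst with memcpy. dst is expected to point at an
   int[] (or, on platforms where the representations match, a long[]). */
void load_values(void *dst, size_t count) {
    int tmp[16];

    assert(dst != NULL && count <= 16);

    for (size_t i = 0; i < count; i++) {
        tmp[i] = (int)(i * 10);   /* placeholder for real input data */
    }
    /* The source has effective type int; what the destination may later
       be read as depends on whether it has a declared type. */
    memcpy(dst, tmp, count * sizeof tmp[0]);
}
```

Calling load_values on a declared int array is the straightforward case. Following the answer's reading of C99, the case to be wary of is passing storage obtained from malloc and then reading it through a long pointer.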