Dealing with data serialization without violating the strict aliasing rule

793 views Asked by At

Often in embedded programming (but not limited to) there is a need to serialize some arbitrary struct in order to send it over some communication channel or write to some memory.

Example

Let's consider a structure composed of different data types in a N-aligned memory region:

struct
{
    float a;
    uint8_t b;
    uint32_t c;
} s; 

Now let's assume we have a library function

void write_to_eeprom(uint32_t *data, uint32_t len);

which is taking the pointer to data to be written as a uint32_t*. Now we would like to write s to the eeprom using this function. A naive approach would be to do something like

write_to_eeprom((uint32_t*)&s, sizeof(s)/4);

But it is a clear violation of the strict aliasing rule.

Second example

struct
{
    uint32_t a;
    uint8_t b;
    uint32_t c;
} s; 

In this case the aliasing (uint32_t*)&s is not violating the rule, as the pointer is compatible with the pointer to the first field type, which is legal. But! The library function can be implemented such that it is doing some pointer arithmetic to iterate the input data, while this arithmetic resulting pointers are incompatible with the data they are pointing to (for example data+1 is the pointer of type uint32_t*, but it might point to the uint8_t field). Which again a violation of the rule, as I understand it.

Possible solution?

Wrap the problematic structure in a union with array of the desired type:

union 
{
    struct_type s;
    uint32_t array[sizeof(struct_type) / 4];
} u;

And pass the u.array to the library function.

Is this the right way to do this? Is this the only right way to do this? What could be some other approaches?

2

There are 2 answers

6
AudioBubble On

Just a note I am not entirely sure but it can be that it is not always safe to cast uint8_t* to char*(here).

Regardless, what does the last parameter of your write function want, number of bytes to write - or number of uint32_t elements? Let's assume later, and also assume you want to write each member of the struct to separate integer. You can do this:

uint32_t dest[4] = {0};
memcpy(buffer, &s.a, sizeof(float));
memcpy(buffer+1, &s.b, sizeof(uint8_t));
memcpy(buffer+2, &s.c, sizeof(uint32_t));

write_to_eeprom(buffer, 3 /* Nr of elements */);

If you want to copy the structure elements to the integer array consecutively - you can first copy the structure member to a byte array consecutively - and then copy the byte array to the uint32_t array. And also pass number of bytes as last parameter which would be - sizeof(float)+sizeof(uint8_t)+sizeof(uint32_t)

0
chux - Reinstate Monica On

Consider that writing to eeprom is often slower, sometimes a lot slower, than writing to normal memory, that using an intervening buffer is rarely a performance drag. I realize this goes against this comment, yet I feel it deserves consideration as it handles all other C concerns

Write a helper function that has no alignment, aliasing nor size issues

extern void write_to_eeprom(/* I'd expect const */ uint32_t *data, uint32_t len);

// Adjust N per system needs
#define BYTES_TO_EEPROM_N 16

void write_bytes_to_eeprom(const void *ptr, size_t size) {
  const unsigned char *byte_ptr = ptr;
  union {
    uint32_t data32[BYTES_TO_EEPROM_N / sizeof (uint32_t)];
    unsigned char data8[BYTES_TO_EEPROM_N];
  } u;

  while (size >= BYTES_TO_EEPROM_N) {
    memcpy(u.data8, byte_ptr, BYTES_TO_EEPROM_N);  // **
    byte_ptr += BYTES_TO_EEPROM_N;
    write_to_eeprom(u.data32, BYTES_TO_EEPROM_N / sizeof (uint32_t));
    size -= BYTES_TO_EEPROM_N;
  }

  if (size > 0) {
    memcpy(u.data8, byte_ptr, size);
    while (size % sizeof (uint32_t)) {
      u.data8[size++] = 0;  // zero fill
    }
    write_to_eeprom(u.data32, (uint32_t) size);
  }
}

// usage - very simple
write_bytes_to_eeprom(&s, sizeof s);

** Could use memcpy(u.data32, byte_ptr, BYTES_TO_EEPROM_N); to handle @zwol issue.