Packing a union of structs


I would like to create an array of different structs with different sizes.

The resulting array must be tightly packed with no null values between structs.

The whole thing must be initialised at compile time, so it can reside in the flash of an embedded system.

The result is a tree of USB configuration descriptors, every descriptor packed in immediately after the last to produce a single configuration blob. Suggestions of different approaches to the problem would be welcomed. http://www.beyondlogic.org/usbnutshell/usb5.shtml#ConfigurationDescriptors

#include <stdint.h>

struct a {
    uint16_t some_field;
};
struct b {
    uint32_t another_field;
};
union detail {
    struct a a;
    struct b b;
};
const union detail configuration[] = {
    { .a = { .some_field = 23 } },
    { .b = { .another_field = 12 } }
};

The above example is a significantly simplified version of my current, failing, attempt. Each element of the array is the size of the largest union member. So every array member is 32 bits and the first entry is padded with zeros.

Current output 1700 0000 0c00 0000

Desired output 1700 0c00 0000

Existing methods to generate this packed output use a giant uint8_t array with macros to insert more complex values such as 16-bit numbers. An array of structs would represent the data more accurately and provide type safety, if it worked.
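For reference, the byte-array approach mentioned above might look roughly like this. It is only a sketch: the LE16 and LE32 macro names and the configuration_blob name are hypothetical, and little-endian byte order is assumed because that is what USB descriptors use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: expand a constant into little-endian bytes. */
#define LE16(x) (uint8_t)((x) & 0xFF), (uint8_t)(((x) >> 8) & 0xFF)
#define LE32(x) LE16((x) & 0xFFFF), LE16(((x) >> 16) & 0xFFFF)

/* The two example values packed back to back: 17 00 0c 00 00 00 */
static const uint8_t configuration_blob[] = {
    LE16(23), /* stands in for struct a, some_field    */
    LE32(12)  /* stands in for struct b, another_field */
};

/* Fails the build if a byte is ever added or dropped by mistake. */
static_assert(sizeof configuration_blob == 6, "blob must be 6 bytes");
```

This gives the desired layout at compile time, but nothing ties a given byte to a named field, which is the type safety problem described above.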

I don't need to be able to index or access the data from the array; the blob is shoved into low-level USB routines. Playing with the gcc packed attribute did not change the standard union array behaviour.


There are 3 answers

lod On BEST ANSWER

Having accepted the comments from @Basile-Starynkevitch, @Jonathan-Leffler and others that what I was hoping for couldn't be done, I reconsidered. What I really required was precise control over the relative placement of the structures in memory/flash. That placement is done by the linker, and I eventually found a solution there.

First, inside the SECTIONS portion of the linker script I created a special block. The only way to ensure order is to create multiple sections and manually order them, cpack0-3 in this instance.

.text : ALIGN(4) /* Align the start of the block */
{
    *(.cpack0) *(.cpack1) *(.cpack2) *(.cpack3)
} > MFlash32

Then the struct variables are slotted into the special sections. The long-handed, repetitive syntax can be simplified with a #define in a real implementation.

const struct a configuration __attribute__((section(".cpack0"), aligned(1))) = {
    .some_field = 23
};

const struct b configuration1 __attribute__((section(".cpack1"), aligned(1))) = {
    .another_field = 12
};

So we have a configuration variable, aligned at a 4 byte address for nice access and defined using a struct for type safety. The subsequent portions of the configuration are also defined by structs for safety and placed in memory sequentially. The aligned(1) attribute ensures that they are packed in tightly with no empty space.

This solves my problem. The configuration is defined via structs, with all the advantages that provides, the ugliness is hidden by a #define, and the final configuration is a binary blob of variable length accessed through a uint8_t* pointer. As the pointer increments it moves seamlessly across the different configuration elements.
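A sketch of what that #define might look like. CPACK is a hypothetical macro name, and the section names assume the .cpack0 to .cpack3 entries from the linker script fragment above. This is gcc-specific, and the C code on its own does not make the two variables adjacent; that still depends on the linker script.

```c
#include <assert.h>
#include <stdint.h>

struct a { uint16_t some_field; };
struct b { uint32_t another_field; };

/* Hypothetical helper: place a variable in section .cpack<n>
 * and request byte alignment (gcc attribute syntax). */
#define CPACK(n) __attribute__((section(".cpack" #n), aligned(1)))

const struct a configuration  CPACK(0) = { .some_field = 23 };
const struct b configuration1 CPACK(1) = { .another_field = 12 };

/* The blob this is meant to produce is 2 + 4 bytes long. */
static_assert(sizeof(struct a) + sizeof(struct b) == 6, "expected 6 bytes");
```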

Basile Starynkevitch On

I would like to create an array of different structs with different sizes.

That is simply not possible in C (and for good reasons). An array (in C) is made of components of the same size (and type). If that were not the case, indexed access to an element of that array would be a very complex and time-consuming operation (which is against the spirit of C; however, in C++ you might define your own operator []).

You could instead have an array of chars (e.g. const char data[] = {0x35, 0x27, 0}; and so on) together with a parsing routine to process it. That big array of bytes could perhaps be generated by an ad-hoc script that emits the C code initializing it. Or you could have an array of pointers:
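A minimal sketch of such a parsing routine, assuming each record starts with a byte holding its own total length (USB descriptors do this with their bLength field). The function name count_records and the example data are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example blob: a 3-byte record followed by a 5-byte record.
 * The first byte of each record is its total length. */
static const uint8_t example_blob[] = {
    0x03, 0x17, 0x00,
    0x05, 0x0c, 0x00, 0x00, 0x00
};

/* Walk the blob record by record.
 * Returns the number of records, or -1 if a length byte is zero
 * or runs past the end of the blob. */
int count_records(const uint8_t *blob, size_t size)
{
    size_t pos = 0;
    int count = 0;

    assert(blob != NULL);
    while (pos < size) {
        uint8_t len = blob[pos];
        if (len == 0 || len > size - pos)
            return -1;
        pos += len;
        count++;
    }
    return count;
}
```

The records are variable length, so finding the n-th one means walking from the start; that is the cost an ordinary array avoids by keeping every element the same size.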

union detail {
  const struct a* aptr;  /* const: the pointed-to objects are const */
  const struct b* bptr;
};

static const struct a firstelem= {.some_field= 35};
static const struct b secondelem= {.another_field= 12};
const union detail configuration[] = {
  {.aptr= &firstelem},
  {.bptr= &secondelem},
};

Notice that in your case an array of pointers actually produces more data, since every pointer is stored in addition to the struct it points to.

Lundin On

You shouldn't use union for this, it doesn't do what you think it does. If you want an array of structs, where each struct may be of different type, it can't be done. Instead you will have to define a "super struct" containing all the other structs, in the correct order.

However, this doesn't solve the problem with alignment/padding. In order to disable padding in structs (and unions), you must resort to non-standard C. A common non-standard extension is #pragma pack. The gcc compiler also supports a non-standard attribute "packed", see What is the meaning of “__attribute__((packed, aligned(4)))”. Since code that disables padding is non-standard, it is also non-portable.
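A sketch of a packed "super struct" for the two structs from the question, using the gcc packed attribute (also accepted by clang). The name configuration_blob is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct a { uint16_t some_field; };
struct b { uint32_t another_field; };

/* All descriptors in order; packed removes the padding that would
 * otherwise be inserted between a and b (non-standard extension). */
struct configuration_blob {
    struct a a;
    struct b b;
} __attribute__((packed));

const struct configuration_blob configuration = {
    .a = { .some_field = 23 },
    .b = { .another_field = 12 }
};

/* Compile-time checks that the layout really is tight. */
static_assert(sizeof(struct configuration_blob) == 6, "no padding expected");
static_assert(offsetof(struct configuration_blob, b) == 2, "b follows a");
```

On a little-endian target the bytes of configuration are 17 00 0c 00 00 00, which is the desired output from the question.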

It is also possible to solve the problem by creating an array of uint8_t and then reading/writing chunks of data into this array. This is known as serialization/de-serialization of data. Conversions from any pointer type to uint8_t* or character types are safe, but unfortunately, going the other way around invokes undefined behavior. This is because of a bug in the C language often referred to as "the strict aliasing rule", which sometimes makes it impossible to use the C language in smooth or meaningful ways when doing hardware-related programming such as this.
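A sketch of byte-wise serialization that only ever accesses the buffer through uint8_t, so no pointer is converted back from uint8_t* to a wider type. The helper names are hypothetical, and little-endian order is chosen because USB descriptors use it.

```c
#include <assert.h>
#include <stdint.h>

/* Store values into a byte buffer in little-endian order. */
void write_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

void write_le32(uint8_t *p, uint32_t v)
{
    write_le16(p, (uint16_t)(v & 0xFFFF));
    write_le16(p + 2, (uint16_t)(v >> 16));
}

/* Read them back, again one byte at a time. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)read_le16(p) | ((uint32_t)read_le16(p + 2) << 16);
}

/* Usage sketch: build the 6-byte blob from the question, read it back. */
void serialize_example(void)
{
    uint8_t blob[6];
    write_le16(&blob[0], 23);
    write_le32(&blob[2], 12);
    assert(read_le16(&blob[0]) == 23);
    assert(read_le32(&blob[2]) == 12);
}
```

This runs at run time, so by itself it does not satisfy the requirement that the blob be initialised at compile time and live in flash.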

The work-around for this C language bug is to write a gigantic union with 2 elements, one which is the uint8_t array, one which is a "super struct" like the one described above. You won't actually use the super struct - you probably can't because of padding - but by putting it in a union you invoke a special exception to strict aliasing. Meaning there will no longer be undefined behavior and you will prevent aggressive optimizing compilers like gcc from breaking your code.
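A sketch of that union, with hypothetical names throughout. The raw member gives a byte view of the same storage as the super struct; note that it also exposes whatever padding the compiler inserted, so this alone does not produce the tightly packed layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct a { uint16_t some_field; };
struct b { uint32_t another_field; };

/* "Super struct": may contain padding between a and b. */
struct super_config {
    struct a a;
    struct b b;
};

/* Two views of the same storage: raw bytes and typed fields. */
union config_storage {
    uint8_t raw[sizeof(struct super_config)];
    struct super_config fields;
};

const union config_storage configuration = {
    .fields = { .a = { .some_field = 23 }, .b = { .another_field = 12 } }
};

/* Hand the byte view to code that expects a uint8_t pointer. */
const uint8_t *config_bytes(void) { return configuration.raw; }
size_t config_size(void) { return sizeof configuration.raw; }

static_assert(sizeof(union config_storage) >= 6, "holds both structs");
```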

Another gcc-specific work-around for this C language bug is to compile with gcc -fno-strict-aliasing. Embedded systems compilers typically work better than gcc for this case, since they don't follow the C standard, but instead make pointer conversions behave deterministically in a non-standard manner. For example, on such compilers code like (uint16_t*)my_uint8t deterministically treats the pointed-at data as uint16_t, rather than silently causing your program to crash and burn.