GCC generates unaligned access for ARMv8

I am setting up an ARMv8-M embedded system and stumbled over a hard fault when running the following code:

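#include <stdint.h>

void test_ram(void)
{
    /* 16-byte array with 12 explicit initializers; the remaining 4 bytes are zero. */
    const uint8_t goldendata[16] = { 0xde, 0xad, 0xba, 0xbe, 0, 0x55, 0x55, 0xaa, 0xaa, 0, 0xff, 0xff };
    (void) goldendata;
}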

Interestingly enough, I get a hard fault (unaligned access) when the array is initialized. GCC apparently even knows that the access is unaligned, as can be seen on Godbolt:

[Image: Godbolt disassembly of test_ram]

Compiled with GCC 13.2 ARM, options -std=c11 -ggdb -ffunction-sections -O0 -Wall -mcpu=cortex-m33 -mthumb -mfpu=fpv5-sp-d16 -mfloat-abi=hard

Question

Why would GCC generate an array initialization which results in a hard fault?

I am not sure whether the unaligned access necessarily causes a hard fault, since I found conflicting Arm documentation (unaligned access always faults vs. faults iff the UNALIGN_TRP bit is set):

Arm® v8-M Architecture Reference Manual DDI0553B.l B6.4 Alignment behavior

The following are unaligned data accesses that always generate an alignment fault:
• Non halfword-aligned LDAH, LDREXH, LDAEXH, STLH, STLEXH, and STREXH.
• Non word-aligned LDREX, LDAEX, STLEX, STREX, LDRD, LDMIA, LDMDB, POP (multiple registers), LDC, VLDR, VLDM, VPOP, LDA, STL, STMIA, STMDB, PUSH (multiple registers), STC, VSTR, VSTM, VPUSH, VLLDM, and VLSTM.
Applies to an implementation of the architecture from Armv8.0-M onwards.

https://developer.arm.com/documentation/100235/0002/jds1485977091086

Unaligned accesses are usually slower than aligned accesses. In addition, some memory regions might not support unaligned accesses. Therefore, Arm recommends that programmers ensure that accesses are aligned. To trap accidental generation of unaligned accesses, use the UNALIGN_TRP bit in the Configuration and Control Register.

I checked the UNALIGN_TRP bit (bit 3), and it is not set, yet I end up with the fault. (gdb x/x 0xE000ED14 => 0xe000ed14: 0x00000201)
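The bit check on the value read in gdb can be reproduced with a few lines of C. This is only a minimal sketch meant to run on a host machine; the helper names `reg_bit_is_set` and `ccr_traps_unaligned` are made up for illustration, and the bit position (bit 3) is the one stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of UNALIGN_TRP in the Configuration and Control Register
 * (CCR, read at 0xE000ED14 above). Taken from the text of the question. */
#define CCR_UNALIGN_TRP_BIT 3u

/* Hypothetical helper: true if bit `bit` is set in a 32-bit register value. */
static bool reg_bit_is_set(uint32_t value, unsigned bit)
{
    assert(bit < 32u); /* a 32-bit register only has bits 0..31 */
    return ((value >> bit) & 1u) != 0u;
}

/* Hypothetical helper: true if a raw CCR value has UNALIGN_TRP set. */
static bool ccr_traps_unaligned(uint32_t ccr)
{
    return reg_bit_is_set(ccr, CCR_UNALIGN_TRP_BIT);
}
```

For the value 0x00000201 only bits 0 and 9 are set, so `ccr_traps_unaligned(0x00000201u)` is false, which matches the observation that trapping is not enabled through the CCR.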

If it makes any difference, the CPU used is a Cortex-M33 in the STM32U5.

Footnote: I found I can suppress the issue either by making the variable static (i.e. not allocating the array unaligned on the stack) or by passing GCC the -mno-unaligned-access option.

There is 1 answer

Answer by ted

Turns out @Nate Eldredge was right:

I had used CubeMX to generate the code that enables the MPU, and for some reason Cube chose to mark the internal flash and the internal RAM as Device memory. Unaligned accesses to Device memory always fault, regardless of the UNALIGN_TRP bit, which explains the hard fault. Amending the initialization code by hand so that the memory is marked as Normal memory fixes the issue. Below is a snippet of the copy-pasted CubeMX code; simply wrapping the attributes in the INNER_OUTER macro remedies the issue. (Apply this to both flash and RAM, and check which region numbers Cube used.)

HAL_MPU_Disable();

// Pretty sure this should be MPU_ATTRIBUTES_NUMBER1, but since attributes and
// region numbers are defined to the same values, we stick with ST's initialisation here.
MPU_AttributesInit.Number = MPU_REGION_NUMBER1;
// INNER_OUTER(...) is the fix: it sets both the inner and the outer attribute.
MPU_AttributesInit.Attributes = INNER_OUTER( MPU_WRITE_BACK | MPU_TRANSIENT | MPU_RW_ALLOCATE );

HAL_MPU_ConfigMemoryAttributes(&MPU_AttributesInit);
HAL_MPU_Enable(MPU_PRIVILEGED_DEFAULT);
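
To illustrate why the macro matters: in the Armv8-M MAIR encoding, an attribute byte whose upper (outer) nibble is 0000 describes Device memory, while any other upper nibble describes Normal memory. The sketch below classifies an attribute byte on that basis and is meant to run on a host. The `ATTR_*` macros are illustrative stand-ins whose numeric values are assumptions, not copied from the ST HAL headers, and `mair_attr_at` and `mair_attr_is_device` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the HAL attribute macros. The values are
 * assumptions chosen so that the combined attribute fits into one nibble;
 * check the real definitions in your HAL headers. */
#define ATTR_WRITE_BACK     0x4u
#define ATTR_TRANSIENT      0x0u
#define ATTR_RW_ALLOCATE    0x3u
/* Copy a 4-bit attribute into both the inner (low) and outer (high) nibble. */
#define ATTR_INNER_OUTER(a) ((uint8_t)((a) | ((a) << 4)))

/* Extract attribute byte `index` (0..3) from a 32-bit MAIR register value,
 * which packs four 8-bit attribute fields. */
static uint8_t mair_attr_at(uint32_t mair, unsigned index)
{
    assert(index < 4u); /* one MAIR register holds four attribute bytes */
    return (uint8_t)(mair >> (8u * index));
}

/* An attribute byte with outer nibble 0000 encodes Device memory. */
static bool mair_attr_is_device(uint8_t attr)
{
    return (attr & 0xF0u) == 0u;
}
```

With these stand-in values, the attribute without the macro is 0x07, which `mair_attr_is_device` reports as Device memory, while `ATTR_INNER_OUTER(0x7u)` yields 0x77, which is Normal memory. Reading the MAIR registers of the running target in gdb and checking the bytes the same way should show whether a region is still configured as Device.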