I'll start with the question and follow it with an example. The description of this flag in the ARM Compiler armclang Reference Guide Version 6.4 (link) says:
If unaligned access is disabled, words in packed data structures are accessed one byte at a time.
As you can see in the following example, after the 1-byte access at address 1e0 there is an (aligned) word access at address 1e2. From the description above, I would expect the byte access used at 1e0 to be used for the remaining bytes of M[1].A as well. I would like an exact description of the behavior with this flag set: does it always behave as in this example, meaning that at aligned addresses the compiler can still access whole words even in packed structs?
Example: for this code,
typedef struct __attribute__((packed, aligned(1))) MyStruct{
    int A;
    short B;
    char C;
} MyStruct_t;
int main(void) {
    MyStruct_t M[2];
    int D, E;
    M[0].A = 0xffffffff;
    M[1].A = 0xeeeeeeee;
    D = M[0].A;
    E = M[1].A;
    D = E;
    return 0;
}
compiled with -mno-unaligned-access and built as follows (using the MCUXpresso IDE):
arm-none-eabi-gcc -nostdlib -Xlinker -Map="m7_experiments.map" -Xlinker --cref -Xlinker --gc-sections -Xlinker -print-memory-usage -mcpu=cortex-m7 -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mthumb -T "m7_experiments_Debug.ld" -o "m7_experiments.axf" $(OBJS) $(USER_OBJS) $(LIBS)
I'm getting the following machine code:
000001b0 <main>:
1b0: b480 push {r7}
1b2: b087 sub sp, #28
1b4: af00 add r7, sp, #0
1b6: f04f 33ff mov.w r3, #4294967295 ; 0xffffffff
1ba: 603b str r3, [r7, #0]
1bc: 2300 movs r3, #0
1be: f063 0311 orn r3, r3, #17
1c2: 71fb strb r3, [r7, #7]
1c4: 2300 movs r3, #0
1c6: f063 0311 orn r3, r3, #17
1ca: 723b strb r3, [r7, #8]
1cc: 2300 movs r3, #0
1ce: f063 0311 orn r3, r3, #17
1d2: 727b strb r3, [r7, #9]
1d4: 2300 movs r3, #0
1d6: f063 0311 orn r3, r3, #17
1da: 72bb strb r3, [r7, #10]
1dc: 683b ldr r3, [r7, #0]
1de: 617b str r3, [r7, #20]
1e0: 79fb ldrb r3, [r7, #7]
1e2: 68ba ldr r2, [r7, #8]
1e4: f022 427f bic.w r2, r2, #4278190080 ; 0xff000000
1e8: 0212 lsls r2, r2, #8
1ea: 4313 orrs r3, r2
1ec: 613b str r3, [r7, #16]
1ee: 693b ldr r3, [r7, #16]
1f0: 617b str r3, [r7, #20]
1f2: 2300 movs r3, #0
1f4: 4618 mov r0, r3
1f6: 371c adds r7, #28
1f8: 46bd mov sp, r7
1fa: f85d 7b04 ldr.w r7, [sp], #4
1fe: 4770 bx lr
EDIT: with the complementary flag -munaligned-access, we get what would be expected in this case:
000001b0 <main>:
1b0: b480 push {r7}
1b2: b087 sub sp, #28
1b4: af00 add r7, sp, #0
1b6: f04f 33ff mov.w r3, #4294967295 ; 0xffffffff
1ba: 603b str r3, [r7, #0]
1bc: 2300 movs r3, #0
1be: f063 0311 orn r3, r3, #17
1c2: 71fb strb r3, [r7, #7]
1c4: 2300 movs r3, #0
1c6: f063 0311 orn r3, r3, #17
1ca: 723b strb r3, [r7, #8]
1cc: 2300 movs r3, #0
1ce: f063 0311 orn r3, r3, #17
1d2: 727b strb r3, [r7, #9]
1d4: 2300 movs r3, #0
1d6: f063 0311 orn r3, r3, #17
1da: 72bb strb r3, [r7, #10]
1dc: 683b ldr r3, [r7, #0]
1de: 617b str r3, [r7, #20]
1e0: f8d7 3007 ldr.w r3, [r7, #7]
1e4: 613b str r3, [r7, #16]
1e6: 693b ldr r3, [r7, #16]
1e8: 617b str r3, [r7, #20]
1ea: 2300 movs r3, #0
1ec: 4618 mov r0, r3
1ee: 371c adds r7, #28
1f0: 46bd mov sp, r7
1f2: f85d 7b04 ldr.w r7, [sp], #4
1f6: 4770 bx lr
The behaviour here occurs because, even though the type is packed and potentially misaligned, the compiler places this particular instance on the stack itself and so knows how it is aligned. The parts of it that fall on aligned word boundaries can therefore be accessed using word-sized reads and writes.
If you access the packed struct through a pointer then the compiler doesn't know its alignment, and so the behaviour is very different.
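To make the sequence at 1e0 to 1ea in the question's first listing easier to follow, here is a rough C sketch of what those instructions appear to compute. It assumes a little-endian target and that the array starts on a 4-byte boundary (as r7 does in the listing); the function names `load_A1_like_listing` and `demo_load` are made up for illustration and are not anything the compiler emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same layout as in the question: sizeof is 7, so M[1].A occupies bytes 7..10. */
typedef struct __attribute__((packed, aligned(1))) MyStruct {
    int A;
    short B;
    char C;
} MyStruct_t;

/* Hypothetical helper mirroring ldrb / ldr / bic.w / lsls / orrs.
   Assumes little-endian byte order and a 4-byte-aligned base address. */
static uint32_t load_A1_like_listing(const unsigned char *base)
{
    uint32_t low;
    uint32_t word;

    assert(((uintptr_t)base & 3u) == 0);

    low = base[7];                         /* ldrb r3, [r7, #7]: lowest byte of M[1].A */
    memcpy(&word, base + 8, sizeof word);  /* ldr  r2, [r7, #8]: aligned word, bytes 8..11 */
    word &= ~0xff000000u;                  /* bic.w: drop byte 11, which belongs to M[1].B */
    word <<= 8;                            /* lsls: move bytes 8..10 into bits 8..31 */
    return low | word;                     /* orrs: reassemble the 32-bit value */
}

/* Hypothetical driver: fills the array as the question does and reloads M[1].A. */
static uint32_t demo_load(void)
{
    _Alignas(4) MyStruct_t M[2];

    memset(M, 0, sizeof M);
    M[0].A = (int)0xffffffffu;
    M[1].A = (int)0xeeeeeeeeu;
    M[1].B = 0x1234;                       /* non-zero so the mask step matters */
    return load_A1_like_listing((const unsigned char *)M);
}
```

So one byte load plus one aligned word load is enough to rebuild the value, and no access ever touches a misaligned address.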
I have not been able to reproduce this exact behaviour on godbolt because it doesn't have your version of armclang, but look at this example compiled with gcc 11:
The same lines which use a str in the first function use four strb in the second.
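For anyone who wants to try this locally, the following is a minimal sketch of the kind of pair of functions being compared. It is not the original Godbolt example, and the names `store_local` and `store_through_pointer` are hypothetical. The comments describe what this answer leads one to expect with -mno-unaligned-access and no optimization; the actual output depends on the compiler version and flags.

```c
#include <assert.h>

typedef struct __attribute__((packed, aligned(1))) MyStruct {
    int A;
    short B;
    char C;
} MyStruct_t;

/* Local object: the compiler chooses the stack slot, so it can know
   that m.A sits on a word boundary and may use a single word store. */
int store_local(void)
{
    MyStruct_t m;
    m.A = 0x11223344;
    return m.A;
}

/* Pointer parameter: the alignment of *p is unknown, so the same
   assignment is expected to be split into byte stores instead. */
void store_through_pointer(MyStruct_t *p)
{
    p->A = 0x11223344;
}
```

Compiling this with something like `arm-none-eabi-gcc -mcpu=cortex-m7 -mthumb -mno-unaligned-access -S` and comparing the two function bodies should show the difference, if the toolchain behaves as described above.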