I'm building an application for NXP LPC4330 (Arm Cortex M0/M4 dual core). I'm compiling using arm-none-eabi-gcc 4.9.3. At one point in my code, I am performing a write to a (32-bit) memory location. Immediately afterwards, if I read back from that memory location, around one time in ten the result indicates that the write did not occur. Subsequent reads at later times indicate the same thing, so it is not a transient condition. Interrupts are disabled at the global level, and the assembler generated by the compiler is clearly attempting the write, so how is it possible that the write is not being actioned?
Specifically, I am writing to SLICE_MUX_CFG0 which is a memory-mapped register in the SGPIO peripheral. When the write works, the peripheral functions correctly. When the read-back indicates that the write has not worked, the peripheral does not function correctly. So, it seems that the register in question is not being set correctly, as indicated by the read-back.
Looking into the .asm (listed below), the write is clear. When I read back the value afterwards, it reads as zero, which - given the listing below - seems to me to be impossible. If I perform a read immediately before the write (see the .c listing, below), the problem goes away, which is perhaps a clue.
What does this indicate? Am I breaking some rule for use of the memory bus? I've looked at the GCC bugs list and can't see anything that relates to this.
The function follows, both source and ASM, with some annotation. What could be happening, here? Why does the write at "store value" apparently not have any effect?
20000f7c <camera_SGPIO_init_sub>:
; disable interrupts globally
20000f7c: b672 cpsid i
20000f7e: 2346 movs r3, #70 ; 0x46
20000f80: 4a16 ldr r2, [pc, #88] ; (20000fdc <camera_SGPIO_init_sub+0x60>)
20000f82: 6013 str r3, [r2, #0]
20000f84: 4a16 ldr r2, [pc, #88] ; (20000fe0 <camera_SGPIO_init_sub+0x64>)
20000f86: 6013 str r3, [r2, #0]
20000f88: 4a16 ldr r2, [pc, #88] ; (20000fe4 <camera_SGPIO_init_sub+0x68>)
20000f8a: 6013 str r3, [r2, #0]
20000f8c: 4a16 ldr r2, [pc, #88] ; (20000fe8 <camera_SGPIO_init_sub+0x6c>)
20000f8e: 6013 str r3, [r2, #0]
20000f90: 4a16 ldr r2, [pc, #88] ; (20000fec <camera_SGPIO_init_sub+0x70>)
20000f92: 3301 adds r3, #1
20000f94: 6013 str r3, [r2, #0]
20000f96: 4a16 ldr r2, [pc, #88] ; (20000ff0 <camera_SGPIO_init_sub+0x74>)
20000f98: 6013 str r3, [r2, #0]
20000f9a: 4a16 ldr r2, [pc, #88] ; (20000ff4 <camera_SGPIO_init_sub+0x78>)
20000f9c: 6013 str r3, [r2, #0]
20000f9e: 4a16 ldr r2, [pc, #88] ; (20000ff8 <camera_SGPIO_init_sub+0x7c>)
20000fa0: 6013 str r3, [r2, #0]
20000fa2: 4a16 ldr r2, [pc, #88] ; (20000ffc <camera_SGPIO_init_sub+0x80>)
20000fa4: 6013 str r3, [r2, #0]
20000fa6: 2240 movs r2, #64 ; 0x40
20000fa8: 4b15 ldr r3, [pc, #84] ; (20001000 <camera_SGPIO_init_sub+0x84>)
20000faa: 601a str r2, [r3, #0]
20000fac: 2290 movs r2, #144 ; 0x90
20000fae: 4b15 ldr r3, [pc, #84] ; (20001004 <camera_SGPIO_init_sub+0x88>)
20000fb0: 0512 lsls r2, r2, #20
20000fb2: 601a str r2, [r3, #0]
; load value
20000fb4: 23c6 movs r3, #198 ; 0xc6
; load destination address
20000fb6: 4a14 ldr r2, [pc, #80] ; (20001008 <camera_SGPIO_init_sub+0x8c>)
; store value
20000fb8: 6013 str r3, [r2, #0]
; read value back
20000fba: 6810 ldr r0, [r2, #0]
20000fbc: 4a13 ldr r2, [pc, #76] ; (2000100c <camera_SGPIO_init_sub+0x90>)
20000fbe: 6013 str r3, [r2, #0]
20000fc0: 4a13 ldr r2, [pc, #76] ; (20001010 <camera_SGPIO_init_sub+0x94>)
20000fc2: 6013 str r3, [r2, #0]
20000fc4: 4a13 ldr r2, [pc, #76] ; (20001014 <camera_SGPIO_init_sub+0x98>)
20000fc6: 6013 str r3, [r2, #0]
20000fc8: 4a13 ldr r2, [pc, #76] ; (20001018 <camera_SGPIO_init_sub+0x9c>)
20000fca: 6013 str r3, [r2, #0]
20000fcc: 4a13 ldr r2, [pc, #76] ; (2000101c <camera_SGPIO_init_sub+0xa0>)
20000fce: 6013 str r3, [r2, #0]
20000fd0: 4a13 ldr r2, [pc, #76] ; (20001020 <camera_SGPIO_init_sub+0xa4>)
20000fd2: 6013 str r3, [r2, #0]
20000fd4: 4a13 ldr r2, [pc, #76] ; (20001024 <camera_SGPIO_init_sub+0xa8>)
20000fd6: 6013 str r3, [r2, #0]
; enable interrupts globally
20000fd8: b662 cpsie i
20000fda: 4770 bx lr
20000fdc: 40086480 .word 0x40086480
20000fe0: 40086484 .word 0x40086484
20000fe4: 40086488 .word 0x40086488
20000fe8: 40086494 .word 0x40086494
20000fec: 40086380 .word 0x40086380
20000ff0: 40086384 .word 0x40086384
20000ff4: 40086388 .word 0x40086388
20000ff8: 4008639c .word 0x4008639c
20000ffc: 40086208 .word 0x40086208
20001000: 40086204 .word 0x40086204
20001004: 40050064 .word 0x40050064
20001008: 40101080 .word 0x40101080
2000100c: 401010a0 .word 0x401010a0
20001010: 40101090 .word 0x40101090
20001014: 401010a4 .word 0x401010a4
20001018: 40101088 .word 0x40101088
2000101c: 401010a8 .word 0x401010a8
20001020: 40101094 .word 0x40101094
20001024: 401010ac .word 0x401010ac
The C-code which compiled to the above follows.
#include <stdint.h>
volatile uint32_t vol_dummy_for_read;
#define __SFS(addr, value) (*((volatile uint32_t *)(addr)) = (value))
#define SGPIO_SLICE_MUX_CFG0 (*((volatile uint32_t*) ... some address ... ))
uint32_t camera_SGPIO_init_sub(void)
{
    __asm volatile ("cpsid i" : : : "memory");
    // configure pins to SGPIO
    __SFS(P9_0, SCU_SFS_INPUT | 6); // D0, SGPIO0
    __SFS(P9_1, SCU_SFS_INPUT | 6);
    __SFS(P9_2, SCU_SFS_INPUT | 6);
    __SFS(P9_5, SCU_SFS_INPUT | 6);
    __SFS(P7_0, SCU_SFS_INPUT | 7);
    __SFS(P7_1, SCU_SFS_INPUT | 7);
    __SFS(P7_2, SCU_SFS_INPUT | 7);
    __SFS(P7_7, SCU_SFS_INPUT | 7); // D7, SGPIO7
    // SGPIO8
    __SFS(P4_2, SCU_SFS_INPUT | 7); // PCLK, SGPIO8
    // configure pins to GPIO
    __SFS(P4_1, SCU_SFS_INPUT | 0); // HSYNC, GPIO2[1]
    // bring SGPIO clock up to full speed (same as PLL1, M4)
    CGU_BASE_PERIPH_CLK = (0 << 1) | (0 << 11) | (9 << 24);
    // SLICE_MUX_CFG
    uint32_t SLICE_MUX_CFG_VALUE =
          (1 << 1) /* clock on falling edge */
        | (1 << 2) /* clock from external pin */
        | (3 << 6) /* shift 1 byte (8 bits) per clock */
        ;
    //// see note above (this fixes it)
    //vol_dummy_for_read = SGPIO_SLICE_MUX_CFG0 ;
    SGPIO_SLICE_MUX_CFG0 = SLICE_MUX_CFG_VALUE; // A
    uint32_t ret = SGPIO_SLICE_MUX_CFG0;
    SGPIO_SLICE_MUX_CFG8 = SLICE_MUX_CFG_VALUE; // I
    SGPIO_SLICE_MUX_CFG4 = SLICE_MUX_CFG_VALUE; // E
    SGPIO_SLICE_MUX_CFG9 = SLICE_MUX_CFG_VALUE; // J
    SGPIO_SLICE_MUX_CFG2 = SLICE_MUX_CFG_VALUE; // C
    SGPIO_SLICE_MUX_CFG10 = SLICE_MUX_CFG_VALUE; // K
    SGPIO_SLICE_MUX_CFG5 = SLICE_MUX_CFG_VALUE; // F
    SGPIO_SLICE_MUX_CFG11 = SLICE_MUX_CFG_VALUE; // L
    __asm volatile ("cpsie i" : : : "memory");
    return ret;
}
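As a sanity check that the disassembly really corresponds to the C source, the two constants can be recomputed on a host machine. This is only a sketch: the macro and function names below are invented for illustration and do not come from any NXP header; only the shift amounts are taken from the listing.

```c
#include <stdint.h>

/* Bit positions exactly as they appear in the C listing. */
#define CLK_CAPTURE_FALLING  (1u << 1)   /* clock on falling edge */
#define CLKGEN_EXTERNAL_PIN  (1u << 2)   /* clock from external pin */
#define PARALLEL_MODE_3      (3u << 6)   /* parallel-mode field set to 3 */

/* Value written to every SLICE_MUX_CFGn register. */
static uint32_t slice_mux_cfg_value(void)
{
    return CLK_CAPTURE_FALLING | CLKGEN_EXTERNAL_PIN | PARALLEL_MODE_3;
}

/* Value written to CGU_BASE_PERIPH_CLK in the listing. */
static uint32_t base_periph_clk_value(void)
{
    return (0u << 1) | (0u << 11) | (9u << 24);
}
```

The results are 0xC6 (198) and 0x09000000, which agree with `movs r3, #198` and with `movs r2, #144` followed by `lsls r2, r2, #20` in the disassembly, so the compiler is emitting the stores the source asks for.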
(I am answering my own question; this answer was reached based on clues offered in the comments, above).
Short Answer
The peripheral is not actioning the register update reliably because the peripheral clock that is driving it (CGU_BASE_PERIPH_CLK) has only just had its speed changed at the time of the write operation. Setting the AUTOBLOCK bit when updating the clock's speed eliminates the problem.
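In terms of the code in the question, the change amounts to setting bit 11 in the value written to CGU_BASE_PERIPH_CLK. A minimal sketch follows; the macro and function names are hypothetical, and the bit positions are assumed from the shifts used in the question (`0 << 11`, `9 << 24`), so check them against your revision of the User Manual.

```c
#include <stdint.h>

/* Assumed field layout of a BASE_x_CLK register (verify in the manual). */
#define BASE_CLK_AUTOBLOCK  (1u << 11)  /* block clock during frequency change */
#define BASE_CLK_SEL_SHIFT  24          /* clock source selection field */

/* Build a BASE_x_CLK value with AUTOBLOCK always set.
 * clk_sel is the source selector; the question uses 9 (PLL1). */
static uint32_t base_clk_with_autoblock(uint32_t clk_sel)
{
    return BASE_CLK_AUTOBLOCK | (clk_sel << BASE_CLK_SEL_SHIFT);
}

/* On the target, this would replace the original assignment:
 *     CGU_BASE_PERIPH_CLK = base_clk_with_autoblock(9);
 */
```

For the question's settings this yields 0x09000800 in place of 0x09000000.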
Discussion
Presumably, the clock to the peripheral is transiently invalid during the frequency change, depending on conditions. Perhaps, if the timing of edges happens to be just-so, very short clock pulses find their way through during the change. Or something similarly unpleasant finds its way down the clock line to the peripheral. In any case, in these unpredictable conditions, the write may not occur, causing the reported failure.
Waiting for a period of time between the clock speed change and the subsequent assignment also eliminates the problem, as one would expect. As reported in the question, performing a read of the register before the write also eliminates the problem; whether this is simply because the read takes time, or because the read stalls (for some undocumented reason) until the peripheral clock has settled, I do not know.
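Where AUTOBLOCK cannot be used, the read-back from the question can be turned into a guard rather than a fixed delay. This is a sketch of that idea, not something taken from the User Manual: `write_verified` is a hypothetical helper, and it assumes the register reads back exactly the value written, which is not true of every peripheral register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Write `value` to `reg`, read it back, and retry up to `max_tries` times.
 * Returns true as soon as the read-back matches, false otherwise. */
static bool write_verified(volatile uint32_t *reg, uint32_t value,
                           unsigned max_tries)
{
    for (unsigned i = 0; i < max_tries; i++) {
        *reg = value;           /* attempt the write */
        if (*reg == value) {    /* did the peripheral take it? */
            return true;
        }
    }
    return false;
}
```

On the target it might be called as `write_verified(&SGPIO_SLICE_MUX_CFG0, SLICE_MUX_CFG_VALUE, 100)`. Note that this only works around the symptom; it does nothing about whatever is reaching the peripheral on its clock line during the frequency change.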
AUTOBLOCK is documented only as far as the statement of function: "Block clock automatically during frequency change". The User Manual gives no indication of the conditions under which the bit should be set, or left clear, during a clock speed change. However, given the evidence reported here, a policy of always setting AUTOBLOCK when updating the speed of a clock in one of these devices, unless there is a known reason to leave it clear, seems wise.
Reference: NXP User Manual for LPC43xx, UM10503 Rev 1.9, Chapter 13.