Why is a write to a memory-mapped peripheral register not actioned (LPC43xx)?


I'm building an application for NXP LPC4330 (Arm Cortex M0/M4 dual core). I'm compiling using arm-none-eabi-gcc 4.9.3. At one point in my code, I am performing a write to a (32-bit) memory location. Immediately afterwards, if I read back from that memory location, around one time in ten the result indicates that the write did not occur. Subsequent reads at later times indicate the same thing, so it is not a transient condition. Interrupts are disabled at the global level, and the assembler generated by the compiler is clearly attempting the write, so how is it possible that the write is not being actioned?

Specifically, I am writing to SLICE_MUX_CFG0 which is a memory-mapped register in the SGPIO peripheral. When the write works, the peripheral functions correctly. When the read-back indicates that the write has not worked, the peripheral does not function correctly. So, it seems that the register in question is not being set correctly, as indicated by the read-back.

Looking into the .asm (listed below), the write is clear. When I read back the value afterwards, it reads as zero, which - given the listing below - seems to me to be impossible. If I perform a read immediately before the write (see the .c listing, below), the problem goes away, which is perhaps a clue.

So what does the above indicate? Am I breaking some rule for use of the memory bus? I've looked at the GCC bugs list and can't see anything that relates to this.

The function follows, both source and ASM, with some annotation. What could be happening, here? Why does the write at "store value" apparently not have any effect?

20000f7c <camera_SGPIO_init_sub>:
; disable interrupts globally
20000f7c:   b672        cpsid   i
20000f7e:   2346        movs    r3, #70 ; 0x46
20000f80:   4a16        ldr r2, [pc, #88]   ; (20000fdc <camera_SGPIO_init_sub+0x60>)
20000f82:   6013        str r3, [r2, #0]
20000f84:   4a16        ldr r2, [pc, #88]   ; (20000fe0 <camera_SGPIO_init_sub+0x64>)
20000f86:   6013        str r3, [r2, #0]
20000f88:   4a16        ldr r2, [pc, #88]   ; (20000fe4 <camera_SGPIO_init_sub+0x68>)
20000f8a:   6013        str r3, [r2, #0]
20000f8c:   4a16        ldr r2, [pc, #88]   ; (20000fe8 <camera_SGPIO_init_sub+0x6c>)
20000f8e:   6013        str r3, [r2, #0]
20000f90:   4a16        ldr r2, [pc, #88]   ; (20000fec <camera_SGPIO_init_sub+0x70>)
20000f92:   3301        adds    r3, #1
20000f94:   6013        str r3, [r2, #0]
20000f96:   4a16        ldr r2, [pc, #88]   ; (20000ff0 <camera_SGPIO_init_sub+0x74>)
20000f98:   6013        str r3, [r2, #0]
20000f9a:   4a16        ldr r2, [pc, #88]   ; (20000ff4 <camera_SGPIO_init_sub+0x78>)
20000f9c:   6013        str r3, [r2, #0]
20000f9e:   4a16        ldr r2, [pc, #88]   ; (20000ff8 <camera_SGPIO_init_sub+0x7c>)
20000fa0:   6013        str r3, [r2, #0]
20000fa2:   4a16        ldr r2, [pc, #88]   ; (20000ffc <camera_SGPIO_init_sub+0x80>)
20000fa4:   6013        str r3, [r2, #0]
20000fa6:   2240        movs    r2, #64 ; 0x40
20000fa8:   4b15        ldr r3, [pc, #84]   ; (20001000 <camera_SGPIO_init_sub+0x84>)
20000faa:   601a        str r2, [r3, #0]
20000fac:   2290        movs    r2, #144    ; 0x90
20000fae:   4b15        ldr r3, [pc, #84]   ; (20001004 <camera_SGPIO_init_sub+0x88>)
20000fb0:   0512        lsls    r2, r2, #20
20000fb2:   601a        str r2, [r3, #0]
; load value
20000fb4:   23c6        movs    r3, #198    ; 0xc6
; load destination address
20000fb6:   4a14        ldr r2, [pc, #80]   ; (20001008 <camera_SGPIO_init_sub+0x8c>)
; store value
20000fb8:   6013        str r3, [r2, #0]
; read value back
20000fba:   6810        ldr r0, [r2, #0]
20000fbc:   4a13        ldr r2, [pc, #76]   ; (2000100c <camera_SGPIO_init_sub+0x90>)
20000fbe:   6013        str r3, [r2, #0]
20000fc0:   4a13        ldr r2, [pc, #76]   ; (20001010 <camera_SGPIO_init_sub+0x94>)
20000fc2:   6013        str r3, [r2, #0]
20000fc4:   4a13        ldr r2, [pc, #76]   ; (20001014 <camera_SGPIO_init_sub+0x98>)
20000fc6:   6013        str r3, [r2, #0]
20000fc8:   4a13        ldr r2, [pc, #76]   ; (20001018 <camera_SGPIO_init_sub+0x9c>)
20000fca:   6013        str r3, [r2, #0]
20000fcc:   4a13        ldr r2, [pc, #76]   ; (2000101c <camera_SGPIO_init_sub+0xa0>)
20000fce:   6013        str r3, [r2, #0]
20000fd0:   4a13        ldr r2, [pc, #76]   ; (20001020 <camera_SGPIO_init_sub+0xa4>)
20000fd2:   6013        str r3, [r2, #0]
20000fd4:   4a13        ldr r2, [pc, #76]   ; (20001024 <camera_SGPIO_init_sub+0xa8>)
20000fd6:   6013        str r3, [r2, #0]
; enable interrupts globally
20000fd8:   b662        cpsie   i
20000fda:   4770        bx  lr
20000fdc:   40086480    .word   0x40086480
20000fe0:   40086484    .word   0x40086484
20000fe4:   40086488    .word   0x40086488
20000fe8:   40086494    .word   0x40086494
20000fec:   40086380    .word   0x40086380
20000ff0:   40086384    .word   0x40086384
20000ff4:   40086388    .word   0x40086388
20000ff8:   4008639c    .word   0x4008639c
20000ffc:   40086208    .word   0x40086208
20001000:   40086204    .word   0x40086204
20001004:   40050064    .word   0x40050064
20001008:   40101080    .word   0x40101080
2000100c:   401010a0    .word   0x401010a0
20001010:   40101090    .word   0x40101090
20001014:   401010a4    .word   0x401010a4
20001018:   40101088    .word   0x40101088
2000101c:   401010a8    .word   0x401010a8
20001020:   40101094    .word   0x40101094
20001024:   401010ac    .word   0x401010ac

The C-code which compiled to the above follows.

volatile uint32_t vol_dummy_for_read;
#define __SFS(addr, value) (*((volatile uint32_t*)(addr)) = (value))
#define SGPIO_SLICE_MUX_CFG0 (*((volatile uint32_t*)  ... some address ... ))

uint32_t camera_SGPIO_init_sub()
{
    __asm volatile ("cpsid i" : : : "memory");

    //  configure pins to SGPIO
    __SFS(P9_0, SCU_SFS_INPUT | 6); // D0, SGPIO0
    __SFS(P9_1, SCU_SFS_INPUT | 6);
    __SFS(P9_2, SCU_SFS_INPUT | 6);
    __SFS(P9_5, SCU_SFS_INPUT | 6);
    __SFS(P7_0, SCU_SFS_INPUT | 7);
    __SFS(P7_1, SCU_SFS_INPUT | 7);
    __SFS(P7_2, SCU_SFS_INPUT | 7);
    __SFS(P7_7, SCU_SFS_INPUT | 7); // D7, SGPIO7

    //  SGPIO8
    __SFS(P4_2, SCU_SFS_INPUT | 7); // PCLK, SGPIO8

    //  configure pins to GPIO
    __SFS(P4_1, SCU_SFS_INPUT | 0); // HSYNC, GPIO2[1]

    //  bring SGPIO clock up to full speed (same as PLL1, M4)
    CGU_BASE_PERIPH_CLK = (0 << 1) | (0 << 11) | (9 << 24);

    //  SLICE_MUX_CFG
    uint32_t SLICE_MUX_CFG_VALUE =
          (1 << 1) /* clock on falling edge */
        | (1 << 2) /* clock from external pin */
        | (3 << 6) /* shift 1 byte (8 bits) per clock */
        ;

////    see note above (this fixes it)
//vol_dummy_for_read = SGPIO_SLICE_MUX_CFG0 ;

    SGPIO_SLICE_MUX_CFG0  = SLICE_MUX_CFG_VALUE; // A
    uint32_t ret = SGPIO_SLICE_MUX_CFG0;
    SGPIO_SLICE_MUX_CFG8  = SLICE_MUX_CFG_VALUE; // I
    SGPIO_SLICE_MUX_CFG4  = SLICE_MUX_CFG_VALUE; // E
    SGPIO_SLICE_MUX_CFG9  = SLICE_MUX_CFG_VALUE; // J
    SGPIO_SLICE_MUX_CFG2  = SLICE_MUX_CFG_VALUE; // C
    SGPIO_SLICE_MUX_CFG10 = SLICE_MUX_CFG_VALUE; // K
    SGPIO_SLICE_MUX_CFG5  = SLICE_MUX_CFG_VALUE; // F
    SGPIO_SLICE_MUX_CFG11 = SLICE_MUX_CFG_VALUE; // L

    __asm volatile ("cpsie i" : : : "memory");

    return ret;
}

1 Answer

Answered by Rattus Ex Machina

(I am answering my own question; the answer is based on clues offered in the comments on the question.)

Short Answer

The peripheral is not actioning the register update reliably because the peripheral clock that is driving it (CGU_BASE_PERIPH_CLK) has only just had its speed changed at the time of the write operation. Setting the AUTOBLOCK bit when updating the clock's speed eliminates the problem.
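
As a minimal sketch of that change: the clock write in the question, (0 << 1) | (0 << 11) | (9 << 24), evaluates to 0x09000000 (the listing builds it as 0x90 shifted left by 20), so bit 11 is clear. Assuming bit 11 is AUTOBLOCK and bits 24 and up are the clock source select, as the question's expression suggests (check both against Chapter 13 of the User Manual), the same write with AUTOBLOCK set is 0x09000800. The macro and helper names below are hypothetical and are not taken from any NXP header.

```c
#include <stdint.h>

/* Bit positions inferred from the question's expression
 * (0 << 11) | (9 << 24); verify against UM10503, Chapter 13. */
#define CGU_BASE_AUTOBLOCK      (1u << 11)  /* block clock during frequency change */
#define CGU_BASE_CLK_SEL_SHIFT  24
#define CGU_CLK_SEL_PLL1        9u          /* source value used in the question */

/* Hypothetical helper: build a base clock control word with AUTOBLOCK set. */
static inline uint32_t cgu_base_clk_word(uint32_t clk_sel)
{
    return CGU_BASE_AUTOBLOCK | (clk_sel << CGU_BASE_CLK_SEL_SHIFT);
}

/* Hypothetical helper: switch a base clock. The register is passed by
 * pointer so the sketch does not depend on a particular device header. */
static inline void cgu_set_base_clk(volatile uint32_t *base_clk_reg,
                                    uint32_t clk_sel)
{
    *base_clk_reg = cgu_base_clk_word(clk_sel);
}
```

In the function from the question, the direct assignment to CGU_BASE_PERIPH_CLK would become cgu_set_base_clk(&CGU_BASE_PERIPH_CLK, CGU_CLK_SEL_PLL1), assuming that macro expands to an lvalue as its use in the question implies.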

Discussion

Presumably, the clock to the peripheral is transiently invalid during the frequency change, depending on conditions. Perhaps, if the timing of edges happens to be just-so, very short clock pulses find their way through during the change. Or something similarly unpleasant finds its way down the clock line to the peripheral. In any case, in these unpredictable conditions, the write may not occur, causing the reported failure.

Waiting for a period of time between the clock speed change and the subsequent assignment also eliminates the problem, understandably. As reported in the question, reading the register before writing it also eliminates the problem; I do not know whether this is simply because the read takes time, or because the read blocks (for some reason) until the peripheral clock has settled.
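
Independently of the cause, a write that silently fails is easier to diagnose if it is checked at the point where it happens. The following is a hedged sketch of a write-and-verify helper, not something I tested on the device: it writes, reads back as the question does, and retries a bounded number of times. The name reg_write_verified and the retry limit are hypothetical, and the approach only suits registers that read back exactly what was written, as SLICE_MUX_CFG0 does according to the question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: write a register, read it back, and retry a
 * bounded number of times. Returns true once the read-back matches,
 * false if it never does (or if max_attempts is 0). */
static bool reg_write_verified(volatile uint32_t *reg, uint32_t value,
                               unsigned max_attempts)
{
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        *reg = value;           /* volatile store: always emitted */
        if (*reg == value) {    /* volatile load: always emitted */
            return true;
        }
    }
    return false;
}
```

A caller could report or assert on a false return instead of continuing with a misconfigured peripheral. This detects the symptom only; it does not replace fixing the clock change itself.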

AUTOBLOCK is documented only as far as the statement of function: "Block clock automatically during frequency change". The User Manual gives no indication of the conditions under which the bit should be set, or left clear, during a clock speed change. However, given the evidence reported here, a policy of always setting AUTOBLOCK when updating the speed of a clock in one of these devices, unless there is a known reason to leave it clear, seems wise.

Reference: NXP User Manual for LPC43xx, UM10503 Rev 1.9, Chapter 13.