In short, this is about an SoC with 2 levels of cache (L1, L2). I need to flush all data from the caches into main DDR memory. The question is in what order that should be done:
- flush L1, flush L2
- or flush L2, flush L1.
Details:
The SoC in question is an AArch64 chip with 4 CPUs. Each CPU has its own L1 cache, and all of them share an L2 cache; main DDR memory sits behind the L2 cache.
On this system, CPU0 starts and:
- initialises itself
- initialises the OS
- initialises the Environment (effectively a bunch of global variables)
- makes preparations for the other CPUs
- releases the resets of the other CPUs, so they can start, initialise themselves and start doing their jobs.
Before allowing the others to start, CPU0 flushes both caches entirely (L1 and L2) so that the global Environment variables are available to the other CPUs for proper initialisation. The other CPUs do their primary initialisation with caches off, so it's important that the data is in main memory, not just in the shared L2.
Caches are flushed by iterating over all sets/ways with the dc csw ... instruction.
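For reference, the set/way iteration described above could be sketched as follows. This is a minimal host-testable sketch, not the poster's actual routine: the function names are hypothetical, it assumes the 32-bit CCSIDR_EL1 layout (no FEAT_CCIDX), and the instruction itself is abstracted behind a callback so that only the operand encoding and the loop structure are shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode CCSIDR_EL1, assuming the 32-bit layout (no FEAT_CCIDX). */
unsigned ccsidr_line_shift(uint32_t ccsidr) { return (ccsidr & 0x7u) + 4u; }          /* log2(line size in bytes) */
unsigned ccsidr_ways(uint32_t ccsidr)       { return ((ccsidr >> 3) & 0x3FFu) + 1u; }  /* associativity */
unsigned ccsidr_sets(uint32_t ccsidr)       { return ((ccsidr >> 13) & 0x7FFFu) + 1u; }

/* Smallest b such that 2^b >= n. */
unsigned ceil_log2(unsigned n)
{
    unsigned b = 0;
    while ((1u << b) < n)
        b++;
    return b;
}

/*
 * Build the operand for a set/way operation such as DC CSW:
 *   way   in the top bits, starting at bit 32 - ceil(log2(ways))
 *   set   starting at bit log2(line size in bytes)
 *   level in bits [3:1], zero-based (0 = L1, 1 = L2)
 */
uint64_t setway_operand(unsigned level, unsigned set, unsigned way,
                        unsigned ways, unsigned line_shift)
{
    unsigned way_shift = 32u - ceil_log2(ways);
    return ((uint64_t)way << way_shift) |
           ((uint64_t)set << line_shift) |
           ((uint64_t)level << 1);
}

/* Hypothetical callback type; on the target it would wrap the instruction,
 * e.g. __asm__ volatile("dc csw, %0" :: "r"(operand) : "memory"); */
typedef void (*setway_op_fn)(uint64_t operand);

/*
 * Walk every set and way of one cache level and hand each operand to op.
 * Returns the number of operations issued. On real hardware the caller is
 * assumed to have selected the level in CSSELR_EL1 (followed by an ISB)
 * before reading CCSIDR_EL1, and to issue a DSB after the loop.
 */
unsigned long clean_level_by_setway(uint32_t ccsidr, unsigned level, setway_op_fn op)
{
    unsigned line_shift = ccsidr_line_shift(ccsidr);
    unsigned ways = ccsidr_ways(ccsidr);
    unsigned sets = ccsidr_sets(ccsidr);
    unsigned long issued = 0;

    for (unsigned way = 0; way < ways; way++) {
        for (unsigned set = 0; set < sets; set++) {
            uint64_t operand = setway_operand(level, set, way, ways, line_shift);
            if (op != NULL)
                op(operand);
            issued++;
        }
    }
    return issued;
}
```

For a 32 KiB, 4-way cache with 64-byte lines this yields 128 sets, so one level takes 512 operations. A mistake in the way shift or in the level field is an easy way to clean the wrong lines, so the encoding is worth checking separately from the ordering question.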
The problem is that some global variables do not make it all the way down to main memory. I can see that the CPUs other than CPU0 read these variables with their default values (as if CPU0 had never assigned them).
Important: this happens when the caches are flushed in the order 'whole L1', then 'whole L2'. When I change the flushing order to L2, then L1, everything is fine and the CPUs read the right values from memory.
But that could still be just luck, with all the necessary Environment variables being evicted by the cache controller rather than by my cache flushing routine.
So what is the proper order for flushing the caches? Thanks.
PS:
- I'm pretty sure that the flushing routine for each cache is fine; it's simply 2 for loops over sets & ways. First it flushes one whole cache, then the whole of the other.
- nothing is certain with caches; the L1-L2 order works more often than not, but I hit the issue regularly enough. So 'working' in this case may just be lucky data eviction.
- we are not talking about any particular OS
The ARMv8 Reference Manual says under D4.4.7:
So the correct order should be L1, then L2.
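As a rough illustration of that order, the sketch below lists the cache levels to clean, innermost first, by walking CLIDR_EL1 from level 1 up to the Level of Coherence. The function name is hypothetical, and the register value would have to be read on the target; here it is passed in as a plain integer so the logic can be checked on any host.

```c
#include <stdint.h>

/*
 * Fill levels[] with the zero-based cache levels that hold data
 * (data-only, separate I/D, or unified), ordered from the level closest
 * to the CPU outwards, stopping at the Level of Coherence (LoC).
 * Returns how many levels were written.
 *
 * CLIDR_EL1 fields used:
 *   Ctype<n>, 3 bits per level starting at bit 0:
 *     0 = no cache, 1 = instruction only, 2 = data only,
 *     3 = separate I and D, 4 = unified
 *   LoC in bits [26:24]
 */
unsigned clean_order_from_clidr(uint32_t clidr, unsigned levels[7])
{
    unsigned loc = (clidr >> 24) & 0x7u;
    unsigned n = 0;

    for (unsigned level = 0; level < loc && level < 7u; level++) {
        unsigned ctype = (clidr >> (3u * level)) & 0x7u;
        if (ctype >= 2u && ctype <= 4u)
            levels[n++] = level;   /* this level must be cleaned by set/way */
    }
    return n;
}
```

For a chip like the one in the question (separate L1 instruction and data caches, a unified L2, LoC = 2), the result is level 0 followed by level 1, that is, L1 first and then L2.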