Do I need to use the volatile keyword for memory accesses in a critical section?


I am writing code for a single processor 32 bit microcontroller using gcc.

I need to consume time-stamped objects from a linked list. Another part of the code which could be asynchronous (maybe in an ISR) adds them to the list.

The critical section is implemented by turning interrupts off and using the barrier() function.

I'm confused about where gcc optimization could break my code by caching pointers to the list items (next most recent item to remove, list head, or free list). I don't want anything inside the while loop to be cached from a previous time around the loop. Will the memory barrier protect me from the compiler deciding to load a pointer once at the start of the function and never reload it again? All these list pointers might be modified in the critical section of the producer code (not shown). I am trying to understand whether pqueue_first should be a volatile pointer, for example.

Presumably, if there is no loop (which is the case for adding to the list), I am OK as long as all the code in the function is inside a critical section?

Please don't just point me to some generic article about volatile or critical sections because I have read a bunch of them, but I am having trouble seeing how to apply it to this specific code. I understand that volatile ensures that the compiler will reload the variable every time it is referenced. But I don't understand the likely scope of optimization and its interaction with memory barriers.

typedef struct {
    EV_EventQueueEntry_t *pqueue_alloc; // allocation (never changes)
    EV_EventQueueEntry_t *pqueue_head; // head of active queue (ISR can change it)
    EV_EventQueueEntry_t *pqueue_free; // head of free list (ISR can change it)
    EV_EventQueueEntry_t *pqueue_first; // soonest item in queue (ISR can change it)
    EV_EventQueueEntry_t *pqueue_first_prev; // back pointer from soonest item (ISR can change it)
    EV_UInt_t max_event_count;
} EV_EventQueue_t;

void RunLoop(EV_EventQueue_t *pev)
{
    while(!timeout)
    {
        // Enter critical section
        disable_interrupts();
        barrier();

        // item with the soonest timestamp
        // this can be changed by ISR add to queue operation
        EV_EventQueueEntry_t *pfirst = pev->pqueue_first;

        if(pfirst!=NULL && EV_PortIsFutureTime(pfirst->event.timestamp, EV_PortGetTime()))
        {
            // Copy out message
            EV_Event_t e = pfirst->event;

            // Remove event from queue
            if(pev->pqueue_first_prev != NULL)
                pev->pqueue_first_prev->pnext = pfirst->pnext;
            else
                pev->pqueue_head = pfirst->pnext;

            // Put event back on free list
            pfirst->pnext = pev->pqueue_free;
            pev->pqueue_free = pfirst;
            pfirst->event.message.type = EV_MESSAGE_NULL;

            // Find next soonest message to process after this one
            pev->pqueue_first = ...;
            pev->pqueue_first_prev = ...; // back pointer

            // Exit critical section
            barrier();
            enable_interrupts();

            // Dispatch message
            ...
        }
        else
        {
            // Exit critical section
            barrier();
            enable_interrupts();

            // waste some time
            ...
        }
    }
}

There are 2 answers

Nick Wilkerson On

The volatile qualifier tells the compiler that a variable can change outside the normal flow of the program, so every access must be performed as an actual load or store to memory; the compiler may not keep a cached copy of the value in a register. (Note that volatile controls compiler behavior, not the hardware data cache.)

In your case, the ISR modifies memory, so any variables shared between the ISR and the main code should be declared volatile. It is not only whole-program optimization that causes the problem; the issue is the compiler keeping a variable's value in a register. When the interrupt occurs, the processor state (register values) is saved and then restored after the ISR returns, which means the registers hold exactly what they held before the interrupt — including any stale copy of a variable the ISR has just changed in memory.

Arch D. Robison On

C++11 has a standard feature for this: std::atomic_signal_fence. C11 has the same feature without the namespace qualifier (atomic_signal_fence, declared in <stdatomic.h>). It is suitable if your program uses only a single thread of execution and you are thus just trying to stop the compiler from moving loads/stores across the fence. Use std::atomic_signal_fence(memory_order_acquire) on entry to the critical section and std::atomic_signal_fence(memory_order_release) on exit from the critical section.

If you're not using C++11 or C11, but are using gcc or a compiler that understands gcc-style asm, you can use __asm__ __volatile__ ("": : :"memory") as a compiler barrier. The volatile qualifier says the asm cannot be removed, and the "memory" clobber tells the compiler the asm may read or modify memory in arbitrary ways, hence the compiler cannot move loads/stores across it and must discard any memory values it was holding in registers.