Does C++11 guarantee memory ordering between a release fence and a consume operation?

735 views Asked by At

Consider the following code:

struct payload
{
    std::atomic< int > value;
};

std::atomic< payload* > pointer( nullptr );

void thread_a()
{
    payload* p = new payload();
    p->value.store( 10, std::memory_order_relaxed );
    std::atomic_thread_fence( std::memory_order_release );
    pointer.store( p, std::memory_order_relaxed );
}

void thread_b()
{
    payload* p = pointer.load( std::memory_order_consume );
    if ( p )
    {
        printf( "%d\n", p->value.load( std::memory_order_relaxed ) );
    }
}

Does C++ make any guarantees about the interaction of the fence in thread a with the consume operation in thread b?

I know that in this example case I can replace the fence + atomic store with a store-release and have it work. But my question is about this particular case using the fence.

Reading the standard text I can find clauses about the interaction of a release fence with an acquire fence, and of a release fence with an acquire operation, but nothing about the interaction of a release fence and a consume operation.

Replacing the consume with an acquire would make the code standards-compliant, I think. But as far as I understand the memory ordering constraints implemented by processors, I should only really require the weaker 'consume' ordering in thread b, as the memory barrier forces all stores in thread a to be visible before the store to the pointer, and reading the payload is dependent on the read from the pointer.

Does the standard agree?

2

There are 2 answers

3
Tsyvarev On

Your code works.

I know that in this example case I can replace the fence + atomic store with a store-release and have it work. But my question is about this particular case using the fence.

Fence with relaxed atomic operation is stronger than corresponded atomic operation. E.g. (from http://en.cppreference.com/w/cpp/atomic/atomic_thread_fence, Notes):

While an atomic store-release operation prevents all preceding writes from moving past the store-release, an atomic_thread_fence with memory_order_release ordering prevents all preceding writes from moving past all subsequent stores.

0
curiousguy On

Although that's clearly the intent, the way the interaction of fences and atomic operations is specified means that only listed combinations are officially supported. (That style of specification is not only verbose, difficult to read, even more difficult to turn into a valid intuition, it's easy to make incomplete.)

I see nothing in the standard supporting pairing a consume operation with a release barrier even though it's impossible for a normal implementation to not support, except by special effort during global program optimization to detect that particular use case and deliberately break it.