Does std::atomic provide atomic behavior, regardless of ordering?


If a variable is declared with the std::atomic template, such as std::atomic<int>, is it guaranteed that access via the methods in std::atomic will result in a consistent value (that is, one written by a write through the std::atomic methods) regardless of ordering?

As far as I know, this is equivalent to asking whether reads or writes can be "torn", that is, written or read in multiple parts visible at the ISA level.

Answer by Peter Cordes (best answer)

Atomic, from the Greek atomos meaning indivisible, is synonymous with "no tearing". It means that the entire operation happens indivisibly. Everything you can do with a std::atomic type is always atomic (no tearing).

C++14 draft N4140 section 29.3 Order and consistency is the first part of that chapter that gets down to details. One of the very first points is (1.4):

[ Note: Atomic operations specifying memory_order_relaxed are relaxed with respect to memory ordering. Implementations must still guarantee that any given atomic access to a particular atomic object be indivisible with respect to all other atomic accesses to that object. — end note ]
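As a rough illustration of that note, here is a minimal sketch in which one thread alternates between two 64-bit patterns using only memory_order_relaxed while another thread loads the value. The helper name count_torn_reads and the two pattern constants are made up for this example. A test like this cannot prove atomicity, it can only fail to find tearing, but the standard guarantees the count is zero.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Two patterns that differ in every bit, so a load that mixed halves of
// both stores would match neither of them.
constexpr std::uint64_t kPatternA = 0x0000000000000000ULL;
constexpr std::uint64_t kPatternB = 0xFFFFFFFFFFFFFFFFULL;

// Hypothetical helper: counts loads that observed a value nobody stored.
inline long count_torn_reads(int iterations) {
    assert(iterations >= 0);
    std::atomic<std::uint64_t> shared{kPatternA};
    std::atomic<bool> stop{false};

    // Writer: relaxed stores only, no ordering requested at all.
    std::thread writer([&] {
        while (!stop.load(std::memory_order_relaxed)) {
            shared.store(kPatternA, std::memory_order_relaxed);
            shared.store(kPatternB, std::memory_order_relaxed);
        }
    });

    // Reader: every relaxed load must return one whole stored value.
    long torn = 0;
    for (int i = 0; i < iterations; ++i) {
        std::uint64_t v = shared.load(std::memory_order_relaxed);
        if (v != kPatternA && v != kPatternB) ++torn;
    }

    stop.store(true, std::memory_order_relaxed);
    writer.join();
    return torn;
}
```

Calling count_torn_reads(100000) should return 0 on any conforming implementation, including 32-bit targets where a plain 64-bit store might be split into two instructions.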

As for the technical language that lays out the atomicity requirement, every operation (like .store(), .load(), .fetch_add()) is defined with wording like this:

§ 29.6.5 Requirements for operations on atomic types

void atomic_store(volatile A * object, C desired) noexcept;
void atomic_store(A * object, C desired) noexcept;
void atomic_store_explicit(volatile A * object, C desired, memory_order order) noexcept;
void atomic_store_explicit(A * object, C desired, memory_order order) noexcept;
void A::store(C desired, memory_order order = memory_order_seq_cst) volatile noexcept;
void A::store(C desired, memory_order order = memory_order_seq_cst) noexcept;
  1. Requires: The order argument shall not be memory_order_consume, memory_order_acquire, nor memory_order_acq_rel.
  2. Effects: Atomically replaces the value pointed to by object or by this with the value of desired. Memory is affected according to the value of order.

And so on, using the word Atomically in every case it applies.
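To make the declarations quoted above concrete, here is a small sketch of the member and free-function forms of store on a std::atomic<int>. The function name store_forms_demo is invented for illustration. Every call is an atomic replacement of the value; only the ordering argument differs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical demo function: exercises each store form once.
inline int store_forms_demo() {
    std::atomic<int> a{0};

    a.store(1);                                   // default: memory_order_seq_cst
    assert(a.load() == 1);

    a.store(2, std::memory_order_release);        // allowed for stores
    a.store(3, std::memory_order_relaxed);        // still atomic, just unordered

    std::atomic_store(&a, 4);                     // free function, seq_cst
    std::atomic_store_explicit(&a, 5, std::memory_order_relaxed);

    // Not allowed by the "Requires" clause for store:
    // a.store(6, std::memory_order_acquire);

    return a.load(std::memory_order_relaxed);     // last value stored: 5
}
```

In a single thread the result is simply the last value stored, so store_forms_demo() returns 5.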

Instead of repeating themselves for add/+, sub/-, and so on for |, &, and ^, they have a key/op table that applies to this block:

C atomic_fetch_key(volatile A * object, M operand) noexcept;
C atomic_fetch_key(A * object, M operand) noexcept;
C atomic_fetch_key_explicit(volatile A * object, M operand, memory_order order) noexcept;
C atomic_fetch_key_explicit(A * object, M operand, memory_order order) noexcept;
C A::fetch_key(M operand, memory_order order = memory_order_seq_cst) volatile noexcept;
C A::fetch_key(M operand, memory_order order = memory_order_seq_cst) noexcept;
  • 28 Effects: Atomically replaces the value pointed to by object or by this with the result of the computation applied to the value pointed to by object or by this and the given operand. Memory is affected according to the value of order. These operations are atomic read-modify-write operations (1.10).
  • 29 Returns: Atomically, the value pointed to by object or by this immediately before the effects.
  • 30 Remark: For signed integer types, arithmetic is defined to use two’s complement representation. There are no undefined results. For address types, the result may be an undefined address, but the operations otherwise have no undefined behavior.
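The read-modify-write wording and the "no undefined results" remark can be sketched as follows. The function names relaxed_counter and wrapped_increment are made up for this example. The first shows that concurrent fetch_add calls never lose an increment even with memory_order_relaxed; the second shows that signed overflow through fetch_add wraps instead of being undefined behavior.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one counter with relaxed RMWs.
inline int relaxed_counter(int threads, int increments_per_thread) {
    assert(threads >= 0 && increments_per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write: no increment can be lost.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load(std::memory_order_relaxed);
}

// Hypothetical helper: signed overflow via fetch_add is well defined.
inline int wrapped_increment() {
    std::atomic<int> x{INT_MAX};
    x.fetch_add(1, std::memory_order_relaxed);  // wraps, two's complement
    return x.load(std::memory_order_relaxed);   // INT_MIN
}
```

Note the contrast with a plain int, where INT_MAX + 1 is undefined behavior and unsynchronized ++ from several threads is a data race.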

The only thing that's optional is ordering / synchronization with loads/stores in other threads (for atomicity without synchronization, use memory_order_relaxed).
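Here is a minimal sketch of what the ordering argument buys on top of atomicity, using an invented function publish_and_consume. The flag itself is atomic with any memory order; the release/acquire pair is what makes the earlier write to the plain, non-atomic payload visible to the reader. With memory_order_relaxed on both sides the flag would still never tear, but reading payload would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: publish a plain int through an atomic flag.
inline int publish_and_consume() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};

    std::thread producer([&] {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // publish
    });

    // The acquire load synchronizes with the release store once it reads true.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;               // guaranteed to observe 42

    producer.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;
}
```

So publish_and_consume() returns 42; the ordering parameter controls visibility of other memory, not whether the atomic access itself is indivisible.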

In fact, there's no way to "turn off" atomicity for loading a wide type where that's expensive (e.g. before a CAS on atomic<pointer_and_ABAcounter> which compiles to lock cmpxchg16b on x86). I used a union hack in that answer to efficiently load the struct.

And more importantly, the union is a workaround for gcc not optimizing ptr_and_counter.ptr to a load of just the pointer, which I think is safe at least on x86. Instead gcc insists on atomically loading the whole struct and then getting the pointer from the result. That's very bad when it's a 16-byte struct on x86-64, and fairly bad on x86-32. (See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80835)