I am reading the article "C++ and the Perils of Double-Checked Locking" which explains the problems in DCLP.
The second part of the article (where the link points) shows an attempt to fix DCLP using nothing but C/C++ volatile (which, as far as I know, is impossible). The authors explain how to do that (the last example is number 11), but then they write:
Unfortunately, all this does nothing to address the first problem—C++'s abstract machine is single threaded, and C++ compilers may choose to generate thread-unsafe code from source like that just mentioned, anyway. Otherwise, lost optimization opportunities lead to too big an efficiency hit. After all this, we're back to square one. But wait, there's more—more processors.
Which means (if I understand correctly) that no matter how carefully we use volatile, it won't work, because "C++'s abstract machine is single threaded, and C++ compilers may choose to generate thread-unsafe code from source like that just mentioned".
But what does "C++'s abstract machine is single threaded" actually mean?
Why don't the examples above, with all of those volatiles, prevent the reordering?
Thanks!
Since C++11, the sentence you marked in bold is no longer true.
What it meant in the past:
The OS/device may support multiple threads, including functions to start them, and so on.
C++ compilers, on the other hand, "think" in terms of a single-threaded environment and are not aware of the problems that can arise with multiple threads. To them, starting a thread is just a normal function call; that the OS does something unusual to the process because of that call is neither known to the compiler nor of interest to it.
Code reordering in a single-threaded environment is allowed as long as the reordered parts are independent of each other (e.g. reads and writes of the same variable make the code using that variable dependent, so their order has to be kept). In a multithreaded environment, the compiler cannot possibly know if and when a variable is touched by another thread, so a reordering that is invisible to one thread can still break another.
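To make the reordering argument concrete, here is a minimal sketch. The names payload, ready, publish_as_written and publish_reordered are made up for illustration and do not come from the article. It shows two stores to unrelated variables in both orders, and that a reader on the same thread cannot tell them apart, which is why a compiler reasoning about a single thread is free to pick either order.

```cpp
#include <cassert>

// Hypothetical globals, plain (non-atomic, non-volatile) on purpose.
int payload = 0;
bool ready = false;

// The order the programmer wrote: data first, then the flag.
void publish_as_written() {
    payload = 42;
    ready = true;
}

// An order the compiler may emit instead: the two stores touch
// different variables, so a single thread cannot tell the difference.
void publish_reordered() {
    ready = true;
    payload = 42;
}

// Same-thread reader: it only runs after a publish function has returned.
int consume() {
    return ready ? payload : -1;
}

// Resets the globals, runs one publish function, and returns what a
// same-thread reader observes afterwards.
int observe(void (*publish)()) {
    payload = 0;
    ready = false;
    publish();
    return consume();
}

// Both orders give the same result when only one thread is involved.
void check_single_threaded_equivalence() {
    assert(observe(publish_as_written) == 42);
    assert(observe(publish_reordered) == 42);
}
```

If a second thread called consume() while publish_reordered() was halfway through, it could see ready set while payload still holds its old value. With plain variables like these, such concurrent access is a data race, so the sketch deliberately stays on one thread.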
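Now, in C++11/C++14, there is OS-independent support for preventing optimizations from breaking threaded code: the standard defines a memory model for multithreaded programs and provides std::thread, std::mutex and std::atomic.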
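As a rough sketch of what that support makes possible, double-checked locking can be written with std::atomic and std::mutex instead of volatile. The class name Singleton, the construction counter and the helper all_threads_see_same_instance are assumptions made for this example; this is one common way to write it, not necessarily the form the article's authors would recommend.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class for illustration only.
class Singleton {
public:
    static Singleton* instance() {
        // Acquire load: a thread that sees a non-null pointer also sees
        // the fully constructed object behind it.
        Singleton* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(mutex_);
            // Second check, now under the lock.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton;
                constructions_.fetch_add(1);
                // Release store: publishes the constructed object.
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }

    // Number of times the constructor path ran (for the demo only).
    static int constructions() { return constructions_.load(); }

private:
    Singleton() = default;
    static std::atomic<Singleton*> instance_;
    static std::mutex mutex_;
    static std::atomic<int> constructions_;
};

std::atomic<Singleton*> Singleton::instance_{nullptr};
std::mutex Singleton::mutex_;
std::atomic<int> Singleton::constructions_{0};

// Calls instance() from several threads and reports whether every
// thread received the same pointer.
bool all_threads_see_same_instance(int thread_count) {
    assert(thread_count > 0);
    std::vector<Singleton*> seen(thread_count, nullptr);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&seen, i] { seen[i] = Singleton::instance(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    for (Singleton* p : seen) {
        if (p != seen[0]) return false;
    }
    return true;
}
```

The acquire/release pair is what tells the compiler and the processor not to move the construction past the pointer store. If all you need is a lazily created singleton, a function-local static is simpler, since C++11 guarantees that its initialization is thread-safe.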