On Mon, Oct 21, 2019 at 10:33 AM Andrey Semashev via Std-Discussion <std-discussion@lists.isocpp.org> wrote:
On 2019-10-21 18:36, Thiago Macieira via Std-Discussion wrote:
> On Monday, 21 October 2019 00:44:24 PDT Jeremy Ong via Std-Discussion wrote:
>> You have now done an atomic load 3 times and it is possible that stores or
>> loads to foo have occurred in between. The pattern for atomic usage is to
>> typically load once, perform a change, and possibly compare/swap in a loop
>> in a manner that prevents ABA bugs.
>
> If the compiler can inline those three function calls, it may also coalesce
> the three loads into one. AFAIK, no compiler currently does it, but it could.
> If it can't inline, then the regular rules of visibility apply and if the
> atomic pointer's own address is not accessible by other threads and the called
> functions, coalescing can happen again.
>
> But if it can change, then the code is actually optimal, since you do have to
> reload it every time. Except that you have to write:
>
> foo.load()->do_one_thing();
> foo.load()->do_another_thing();
> foo.load()->do_a_final_thing();

I don't think the compiler can ever prove that the pointer is not
accessed by any other threads or signal handlers. Even in case of LTO,
the compiler can't be sure external threads spawned by OS or shared
libraries or signal handlers don't modify the atomic pointer.

The compiler is always* free to transform atomic accesses such that they are performed as if in isolation (as long as it doesn't break forward progress). There's no way for another thread to observe the difference between the compiler doing this transformation, and the hardware scheduling the threads and memory accesses such that this happens. This is allowed even if any of the do_* functions modify foo, as the compiler can just insert an atomic store at the end of the sequence (again, as long as it doesn't break forward progress).

A good way to think about this is to ask the question: "Is this specific execution always valid according to the C++ memory model, regardless of what other threads do?" If so, then it's valid for the compiler to transform the code such that that execution does always happen.
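
To make that concrete, here is a minimal sketch of the two forms under discussion. Widget and the names three_loads and one_load are hypothetical, invented for illustration; only foo and the do_* calls come from the thread. The second function is one execution of the first that the memory model already permits (no other thread's store happens to land between the loads), which is why a compiler would be allowed to emit it.

```cpp
#include <atomic>

// Hypothetical stand-in for whatever foo points to in the thread.
struct Widget {
    int calls = 0;
    void do_one_thing()     { ++calls; }
    void do_another_thing() { ++calls; }
    void do_a_final_thing() { ++calls; }
};

// As written in the thread: three separate atomic loads. Another thread
// may store to foo between any two of them, so each call can land on a
// different object.
void three_loads(std::atomic<Widget*>& foo) {
    foo.load()->do_one_thing();
    foo.load()->do_another_thing();
    foo.load()->do_a_final_thing();
}

// The coalesced form: a single load, reused for all three calls. This
// matches the execution of three_loads in which no store to foo becomes
// visible between the loads, so no other thread can tell the difference.
void one_load(std::atomic<Widget*>& foo) {
    Widget* p = foo.load();
    p->do_one_thing();
    p->do_another_thing();
    p->do_a_final_thing();
}
```

If the program needs all three calls to hit the same object, writing the one_load form by hand guarantees it; relying on three_loads leaves the choice to the scheduler and, potentially, the optimizer.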

- Michael Spencer