Date: Mon, 21 Oct 2019 12:54:03 -0700
On Mon, Oct 21, 2019 at 11:33 AM Michael Spencer <bigcheesegs_at_[hidden]>
wrote:
> On Mon, Oct 21, 2019 at 10:33 AM Andrey Semashev via Std-Discussion <
> std-discussion_at_[hidden]> wrote:
>
>> On 2019-10-21 18:36, Thiago Macieira via Std-Discussion wrote:
>> > On Monday, 21 October 2019 00:44:24 PDT Jeremy Ong via Std-Discussion wrote:
>> >> You have now done an atomic load 3 times and it is possible that stores or
>> >> loads to foo have occurred in between. The pattern for atomic usage is to
>> >> typically load once, perform a change, and possibly compare/swap in a loop
>> >> in a manner that prevents ABA bugs.
>> >
>> > If the compiler can inline those three function calls, it may also coalesce
>> > the three loads into one. AFAIK, no compiler currently does it, but it could.
>> > If it can't inline, then the regular rules of visibility apply and if the
>> > atomic pointer's own address is not accessible by other threads and the called
>> > functions, coalescing can happen again.
>> >
>> > But if it can change, then the code is actually optimal, since you do
>> > have to reload it every time. Except that you have to write:
>> >
>> > foo.load()->do_one_thing();
>> > foo.load()->do_another_thing();
>> > foo.load()->do_a_final_thing();
>>
>> I don't think the compiler can ever prove that the pointer is not
>> accessed by any other threads or signal handlers. Even in case of LTO,
>> the compiler can't be sure external threads spawned by OS or shared
>> libraries or signal handlers don't modify the atomic pointer.
>>
>
> The compiler is always* free to transform atomic accesses such that they
> are performed as if in isolation (as long as it doesn't break forward
> progress). There's no way for another thread to observe the difference
> between the compiler doing this transformation, and the hardware scheduling
> the threads and memory accesses such that this happens. This is allowed
> even if any of the do_* functions modify foo, as the compiler can just
> insert an atomic store at the end of the sequence (again, as long as it
> doesn't break forward progress).
>
I realized over lunch I should be a bit more precise here. The question I
stated below is the correct high level way to look at things, and the
statement here doesn't quite capture some of the other things that could
make the transformation invalid. It's not enough to just not break forward
progress; the compiler must also ensure that it doesn't break the order in
which memory is observed by other threads. For example, if one of the do_*
functions does an atomic SC (sequentially consistent) load of some other
variable, it must reload foo due to SC rules. Otherwise the program could
observe a state where another thread did an SC store to foo then an SC
store to some other variable, and this thread sees the store to that other
variable, but not the store to foo.
You can see this on cppmem (http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/)
with the following example:

int main() {
  atomic_int foo = 2;
  atomic_int z = 0;
  {{{ { foo.load().readsvalue(2);   // Don't observe the write to foo.
        z.load().readsvalue(1);     // Observe the write to z.
        foo.load().readsvalue(2); } // Reuse the previous load from foo.
  ||| { foo.store(3);
        z.store(1); }
  }}};
  return 0;
}

Of the 480 possible executions, none of them are consistent, thus there is
no execution that the compiler could pick that would reuse the first load
of foo, assuming it can't control the value it reads for z.
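
For anyone who would rather poke at this in ordinary C++ than in cppmem, here
is a rough sketch of the same litmus test using std::atomic with the default
seq_cst ordering. The helper name count_forbidden_outcomes and the iteration
count are my own inventions for illustration, not anything from the standard.
It counts how often the reader sees the write to z but still reads the old
value of foo on its second load. A run that returns 0 proves nothing by
itself (a stress test can only fail to find a bug), but it matches what the
memory model says must happen.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (name and shape are assumptions for this sketch):
// runs the two-thread litmus test `iterations` times and returns how many
// times the forbidden outcome was observed.
int count_forbidden_outcomes(int iterations) {
    assert(iterations >= 0);
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> foo{2};
        std::atomic<int> z{0};
        int z_seen = 0;
        int foo_second = 0;

        // Reader: load foo, load z, load foo again (all seq_cst by default).
        std::thread reader([&] {
            (void)foo.load();          // first load; may read 2 or 3
            z_seen = z.load();         // may or may not see the write to z
            foo_second = foo.load();   // must read 3 if z_seen == 1
        });

        // Writer: store to foo, then to z (both seq_cst by default).
        std::thread writer([&] {
            foo.store(3);
            z.store(1);
        });

        reader.join();
        writer.join();

        // Forbidden: the write to z was observed, yet the later load of foo
        // still produced the stale value. Reusing the first load's result
        // for the second load could manufacture exactly this outcome.
        if (z_seen == 1 && foo_second != 3) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Calling count_forbidden_outcomes(1000) from main and printing the result is
enough to try it; build with thread support enabled (for example -pthread on
GCC or Clang).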
- Michael Spencer
>
> A good way to think about this is to ask the question: "Is this specific
> execution always valid according to the C++ memory model, regardless of
> what other threads do?" If so, then it's valid for the compiler to
> transform the code such that that execution does always happen.
>
> - Michael Spencer
>
Received on 2019-10-21 14:56:29