Re: [wg14/wg21 liaison] (SC22WG14.19259) C memory object model study group - uninitialised reads and padding

From: Uecker, Martin <Martin.Uecker_at_[hidden]>
Date: Wed, 14 Apr 2021 19:24:29 +0000
On Wednesday, 14.04.2021, 19:19 +0000, Uecker, Martin wrote:
> On Wednesday, 14.04.2021, 15:12 -0400, Aaron Ballman wrote:
> > On Wed, Apr 14, 2021 at 3:10 PM Uecker, Martin
> > <Martin.Uecker_at_[hidden]> wrote:
> > > On Wednesday, 14.04.2021, 21:51 +0300, Ville Voutilainen wrote:
> > > > On Wed, 14 Apr 2021 at 21:47, Jens Gustedt via Liaison
> > > > <liaison_at_[hidden]ocpp.org> wrote:
> > > > > On 14 April 2021 20:07:18 CEST, JF Bastien <cxx_at_jfbastien.com> wrote:
> > > > > > On Wed, Apr 14, 2021 at 11:00 AM Uecker, Martin <Martin.Uecker_at_med.uni-goettingen.de>
> > > > > > wrote:
> > > > > > > On Wednesday, 14.04.2021, 08:54 -0700, JF Bastien via Liaison wrote:
> > > > > > > > On Tue, Apr 13, 2021 at 11:40 AM Peter Sewell <Peter.Sewell_at_cl.cam.ac.uk>
> > > > > > > > wrote:
> > > > > > > > > - reading uninitialised representation bytes and padding bytes is also
> > > > > > > > > necessary for other bytewise polymorphic operations: memcmp, marshalling,
> > > > > > > > > encryption, and hashing (deferring what one knows about the results of
> > > > > > > > > such reads for a moment). It's not clear how generally these operations
> > > > > > > > > have to be supported, and we would like more data. Atomic cmpxchg on large
> > > > > > > > > structs, implemented with locks, would do a memcmp/memcpy combination (in
> > > > > > > > > fact is described as such in the standard).
> > > > > > > > >
> > > > > > > >
> > > > > > > > For atomics with padding, C++20 adopted the following change (and I expect
> > > > > > > > that compilers will implement it in previous versions as well):
> > > > > > > > http://wg21.link/P0528
> > > > > > >
> > > > > > > I am not terribly excited about this solution.
> > > > > > >
> > > > > > > I think C should stick to the memcmp/memcpy semantics of cmpxchg
> > > > > > > which operate on the representation including padding. This fits
> > > > > > > to the hardware instructions, simplifies compiler design (no
> > > > > > > need to look into each type), is easy to explain, handles all
> > > > > > > cases consistently including unions, and is what most
> > > > > > > C programmers would expect.
> > > > > >
> > > > > > OK, but it doesn't work, as explained in the paper.
> > > > >
> > > > > Well, there is actually not much of an explanation in the paper.
> > > > >
> > > > > And "doesn't work" isn't much of a description.
> > > >
> > > > You need to look at the R0 revision
> > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0528r0.html
> > >
> > > The closest thing which could apply to C is the following:
> > >
> > > Padded infloop_maybe(Atomic* atomic) {
> > >   Padded desired;  // Padding unknown.
> > >   Padded expected; // Could be different.
> > >   peek("desired before", &desired);
> > >   peek("expected before", &expected);
> > >   peek("atomic before", atomic);
> > >   while (
> > >     !atomic->compare_exchange_strong(
> > >       expected,
> > >       desired // Padding bits added and removed here ˙ ͜ʟ˙
> > >     ));
> > >   peek("expected after", &expected);
> > >   peek("atomic after", atomic);
> > >   return expected; // Maybe changed here as well.
> > > }
> > >
> > > But the claim that this can loop indefinitely seems wrong.
> > >
> > > The padding of desired is irrelevant.
> > >
> > > If the padding of 'expected' is different from
> > > the padding of 'atomic', then there is one additional
> > > execution of the loop where 'expected' is
> > > updated to a version with the right padding. Then
> > > in the next round the compare exchange succeeds.
> >
> > Is that guaranteed? I believe padding can take on arbitrary bit
> > patterns at any point in time, so I think it could loop indefinitely
> > in theory (but likely wouldn't in practice).
>
> According to the standard text, padding takes unspecified
> values when a struct member is written - not at any time.
>
> But this is not even relevant here, because we do not
> write to a struct member, we copy the full representation
> of 'expected' as if by memcpy. This memcpy then sets

Correction: the representation bytes of 'atomic' to 'expected'

> the representation bytes to the right values for this to work.

That is: we then compare (as if by memcmp) the representation
bytes of 'expected' (which we just copied) with the bytes of
'atomic' (from which we copied them), and if nobody changed
'atomic' in between they will be the same and the 'memcmp'
comparison reports them as equal.
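
To make the argument concrete, here is a minimal single-threaded sketch
of the memcmp/memcpy semantics described above. It is only a model under
stated assumptions, not how any implementation does it: the struct
layout and the helper names (make_repr, model_cmpxchg,
attempts_until_success) are hypothetical, the lock is omitted, and the
objects are handled as raw byte buffers so that no struct member is
written and the padding bytes stay exactly as they were set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical struct; on typical ABIs there is padding between 'c' and 'i'.
struct Padded {
    char c;
    int  i;
};

constexpr std::size_t kSize = sizeof(Padded);

// Build an object representation: every byte (padding included) is set to
// 'fill', then the member bytes are written via memcpy at their offsets.
void make_repr(unsigned char* repr, unsigned char fill, char c, int i) {
    std::memset(repr, fill, kSize);
    std::memcpy(repr + offsetof(Padded, c), &c, sizeof c);
    std::memcpy(repr + offsetof(Padded, i), &i, sizeof i);
}

// Model of a lock-based compare-exchange (lock omitted, single thread):
// compare full representations; on mismatch copy 'object' into 'expected'.
bool model_cmpxchg(unsigned char* object, unsigned char* expected,
                   const unsigned char* desired) {
    if (std::memcmp(object, expected, kSize) == 0) {
        std::memcpy(object, desired, kSize);
        return true;
    }
    std::memcpy(expected, object, kSize);
    return false;
}

// Counts how many compare-exchange calls the retry loop needs when the
// member values agree but the padding bytes may differ.
int attempts_until_success(unsigned char object_fill,
                           unsigned char expected_fill) {
    unsigned char object[kSize], expected[kSize], desired[kSize];
    make_repr(object, object_fill, 'x', 42);
    make_repr(expected, expected_fill, 'x', 42);
    make_repr(desired, 0, 'y', 7);

    int attempts = 1;
    while (!model_cmpxchg(object, expected, desired)) {
        // With no concurrent writer, a second failure cannot happen:
        // the failed call just copied 'object' into 'expected'.
        assert(attempts < 2);
        ++attempts;
    }
    return attempts;
}
```

With identical padding the first call succeeds; with different padding
the first call fails, updates 'expected', and the second call succeeds.
The sketch says nothing about the case where another thread keeps
modifying 'atomic', nor about padding changing on member writes.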

Martin

Received on 2021-04-14 14:24:45