Re: [wg14/wg21 liaison] (SC22WG14.19259) C memory object model study group - uninitialised reads and padding

From: Richard Smith <richardsmith_at_[hidden]>
Date: Wed, 14 Apr 2021 13:46:04 -0700
On Wed, Apr 14, 2021 at 12:19 PM Uecker, Martin via Liaison <
liaison_at_[hidden]> wrote:

> On Wednesday, 14 April 2021 at 15:12 -0400, Aaron Ballman wrote:
> > On Wed, Apr 14, 2021 at 3:10 PM Uecker, Martin
> > <Martin.Uecker_at_[hidden]> wrote:
> > > On Wednesday, 14 April 2021 at 21:51 +0300, Ville Voutilainen wrote:
> > > > On Wed, 14 Apr 2021 at 21:47, Jens Gustedt via Liaison
> > > > <liaison_at_[hidden]> wrote:
> > > > > On 14 April 2021 20:07:18 CEST, JF Bastien <cxx_at_[hidden]> wrote:
> > > > > > On Wed, Apr 14, 2021 at 11:00 AM Uecker, Martin <Martin.Uecker_at_[hidden]> wrote:
> > > > > > > On Wednesday, 14 April 2021 at 08:54 -0700, JF Bastien via Liaison wrote:
> > > > > > > > On Tue, Apr 13, 2021 at 11:40 AM Peter Sewell <Peter.Sewell_at_[hidden]> wrote:
> > > > > > > > > - reading uninitialised representation bytes and padding bytes is also
> > > > > > > > > necessary for other bytewise polymorphic operations: memcmp, marshalling,
> > > > > > > > > encryption, and hashing (deferring what one knows about the results of
> > > > > > > > > such reads for a moment). It's not clear how generally these operations
> > > > > > > > > have to be supported, and we would like more data. Atomic cmpxchg on large
> > > > > > > > > structs, implemented with locks, would do a memcmp/memcpy combination (in
> > > > > > > > > fact is described as such in the standard).
> > > > > > > > >
> > > > > > > >
> > > > > > > > For atomics with padding, C++20 adopted the following change (and I expect
> > > > > > > > that compilers will implement it in previous versions as well):
> > > > > > > > http://wg21.link/P0528
> > > > > > >
> > > > > > > I am not terribly excited about this solution.
> > > > > > >
> > > > > > > I think C should stick to the memcmp/memcpy semantics of cmpxchg
> > > > > > > which operate on the representation including padding. This fits
> > > > > > > to the hardware instructions, simplifies compiler design (no
> > > > > > > need to look into each type), is easy to explain, handles all
> > > > > > > cases consistently including unions, and is what most
> > > > > > > C programmers would expect.
> > > > > >
> > > > > > OK, but it doesn't work, as explained in the paper.
> > > > >
> > > > > Well, there is actually not much of an explanation in the paper.
> > > > >
> > > > > And "doesn't work" isn't much of a description.
> > > >
> > > > You need to look at the R0 revision
> > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0528r0.html
> > >
> > > The closest thing which could apply to C is the following:
> > >
> > > Padded infloop_maybe(Atomic* atomic) {
> > >   Padded desired;   // Padding unknown.
> > >   Padded expected;  // Could be different.
> > >   peek("desired before", &desired);
> > >   peek("expected before", &expected);
> > >   peek("atomic before", atomic);
> > >   while (
> > >       !atomic->compare_exchange_strong(
> > >           expected,
> > >           desired  // Padding bits added and removed here ˙ ͜ʟ˙
> > >       ));
> > >   peek("expected after", &expected);
> > >   peek("atomic after", atomic);
> > >   return expected;  // Maybe changed here as well.
> > > }
> > >
> > > But the claim that this can loop indefinitely seems wrong.
> > >
> > > The padding of desired is irrelevant.
> > >
> > > If the padding of 'expected' is different from
> > > the padding of 'atomic', then there is one additional
> > > execution of the loop where 'expected' is
> > > updated to a version with the right padding. Then
> > > in the next round the compare exchange succeeds.
> >
> > Is that guaranteed? I believe padding can take on arbitrary bit
> > patterns at any point in time, so I think it could loop indefinitely
> > in theory (but likely wouldn't in practice).
>
> According to the standard text, padding takes unspecified
> values when a struct member is written - not at any time.
>

This is probably the relevant difference, then. I don't think the C++ model
includes an idea that padding bits have a value, let alone a stable one; a
C++ implementation is at liberty to represent a `struct { char c; int n; }`
as a `char` plus an `int`, and not explicitly model any padding between the
char and the int.
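
To make the distinction concrete, here is a minimal sketch that compares two such objects by member value and by raw bytes. The names `CharInt`, `Comparison`, `fill_then_assign`, and `compare_after_fill` are hypothetical and exist only for this illustration. The member comparison is well defined; whether the bytewise comparison agrees with it depends on whether the implementation has padding at all and what it does with it, so the sketch reports that result without relying on it.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical name for the struct mentioned above.
struct CharInt {
    char c;
    int n;
};

struct Comparison {
    bool same_value;           // member-by-member equality
    bool same_representation;  // byte-by-byte equality, padding included
};

// Write every byte of the object, then store the same member values.
void fill_then_assign(CharInt& s, unsigned char fill) {
    std::memset(&s, fill, sizeof s);
    s.c = 'x';
    s.n = 42;
}

Comparison compare_after_fill() {
    CharInt a, b;
    fill_then_assign(a, 0x00);
    fill_then_assign(b, 0xFF);

    Comparison result;
    result.same_value = (a.c == b.c) && (a.n == b.n);
    // Assumption: nothing here is promised by the language. On typical
    // implementations with padding after 'c' the bytes differ, but an
    // implementation is not required to keep the fill pattern there.
    result.same_representation = std::memcmp(&a, &b, sizeof(CharInt)) == 0;

    assert(result.same_value);  // the only part the sketch relies on
    return result;
}
```

Calling `compare_after_fill()` always yields `same_value == true`; `same_representation` is the part the two models disagree about.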


> But this is not even relevant here, because we do not
> write to a struct member, we copy the full representation
> of 'expected' as if by memcpy. This memcpy then sets
> the representation bytes to the right values for this to work.
>
> Best,
> Martin
>
> _______________________________________________
> Liaison mailing list
> Liaison_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/liaison
> Link to this post: http://lists.isocpp.org/liaison/2021/04/0423.php
>
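
For reference, the two-iteration argument quoted above can be sketched under the C-style reading, where compare-exchange is a memcmp followed by a memcpy over the whole object representation. This is a single-threaded model, not `std::atomic`: `ByteCell`, `make_rep`, and `attempts_until_success` are hypothetical names, the lock a real implementation would hold is omitted, and representations are built as explicit byte arrays so that the padding bytes have known values. It says nothing about the C++ model, in which padding has no value to compare.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

struct Padded {
    char c;
    int n;
};

using Bytes = std::array<unsigned char, sizeof(Padded)>;

// Build an object representation by hand: every byte gets 'fill', then the
// member bytes are written at their offsets, so any padding keeps 'fill'.
Bytes make_rep(char c, int n, unsigned char fill) {
    Bytes b;
    b.fill(fill);
    std::memcpy(b.data() + offsetof(Padded, c), &c, sizeof c);
    std::memcpy(b.data() + offsetof(Padded, n), &n, sizeof n);
    return b;
}

// Hypothetical cell with memcmp/memcpy compare-exchange semantics.
// A real implementation would take a lock around the body.
struct ByteCell {
    Bytes storage;

    bool compare_exchange(Bytes& expected, const Bytes& desired) {
        if (std::memcmp(storage.data(), expected.data(), storage.size()) == 0) {
            storage = desired;  // success: install the new representation
            return true;
        }
        expected = storage;     // failure: copy back all bytes, padding too
        return false;
    }
};

// Same member values in the cell and in 'expected', different padding bytes.
int attempts_until_success() {
    ByteCell cell{make_rep('a', 1, 0x00)};
    Bytes expected = make_rep('a', 1, 0xFF);
    const Bytes desired = make_rep('b', 2, 0x55);

    int attempts = 1;
    while (!cell.compare_exchange(expected, desired)) {
        ++attempts;
    }
    return attempts;
}
```

With no concurrent writer, the loop needs at most two attempts: the first failure copies the stored padding into `expected`, and the second comparison then matches. If `Padded` happens to have no padding on a given implementation, the first attempt already succeeds.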

Received on 2021-04-14 15:46:18