std-proposals

Re: P1839 and the object representation of subobjects

From: Jason McKesson <jmckesson_at_[hidden]>
Date: Tue, 21 Jul 2020 22:32:04 -0400
On Tue, Jul 21, 2020 at 7:58 PM language.lawyer--- via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
> On 22/07/2020 02:29, Thiago Macieira via Std-Proposals wrote:
> > On Tuesday, 21 July 2020 12:44:13 PDT language.lawyer--- via Std-Proposals
> > wrote:
> >> auto qoptr2 = reinterpret_cast<QObject *>(
> >>     reinterpret_cast<byte *>(c1ptr) - off
> >> );
> >> assert(qoptr == qoptr2);
> >>
> >> if `c1ptr` really points to member subobject, `reinterpret_cast<byte
> >> *>(c1ptr)` points to the **first** element of its object representation, so
> >> subtracting a positive number from such a pointer value is UB because of
> >> [expr.add]/4.
> >
> > IIUC, what you're saying is that, given:
> >
> > struct S
> > {
> >     int i, j;
> > };
> >
> > You can take a pointer to S and get to its underlying representation as an
> > array of bytes, so you can do arithmetic plus std::launder and get to the
> > member j.
> >
> > expr.add/4 says about negative offsetting:
> >
> > [...] the expression P - J points to the (possibly-hypothetical)
> > array element i−j of x if 0≤i−j≤n.
> > Otherwise, the behavior is undefined.
> >
> > IIUC, you're saying that casting the int pointer to a byte representation
> > returns the first (index 0) element of the representation, so subtracting a
> > positive number results in a negative index, which means UB.
> >
> > Why can't it be understood to return the 5th element of S's representation?
>
> To my reading of P1839R2, an object of type S and its member subobjects corresponding to S::i and S::j NSDMs all have their own object representation[ array]s.
>
> Section 6.3 "The std::launder issue" says:
> > Multiple objects may occupy the same storage, in which case the objects’ respective object representations will overlap
>
> The proposed wording for [intro.object] says:
> """
> The object representation of an object a of type cv T is a sequence of N cv unsigned char objects that occupy the same storage as a, where N is equal to sizeof(T). The sequence is considered to be an array of N T if the object occupies contiguous bytes of storage... The object representation of an object nested within an object o is guaranteed to appear in the object representation of o.
> """
>
> So:
> 1) This wording doesn't say that this applies only to complete objects, so there must be a separate array for each object, even subobjects.
> 2) I've asked what the last sentence means
> https://lists.isocpp.org/std-proposals/2019/08/0263.php:
> > And what do you mean by "APPEAR in the object representation of `o`"?
> > "Appear" means that the object representation of an object `a` nested within an object `b` shares *the same* `unsigned char` objects or there are two *different* sets of `unsigned char` objects in the region of storage occupied by both `a` and `b`?
>
> And got the following answer (https://lists.isocpp.org/std-proposals/2019/08/0264.php):
> >> And what do you mean by "APPEAR in the object representation of `o`"?
> > For each object in the object representation of the object nested within o, there will be a corresponding object of the same value in the object representation of o
>
> Here it is said that only *values* of object representation elements are the same, not that they are the same objects. To me, it reads as that the author confirms that there are separate object representation arrays (when they are considered to be arrays) containing object representation elements for each object, even if this object is nested within some other object.
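
To make the "same values" reading concrete, here is a minimal sketch
that checks it without any pointer arithmetic on the object itself. It
reuses the struct S from earlier in the thread; the function name
`member_bytes_appear_in_enclosing` is made up for illustration. It
only shows that the bytes of `s.j` have the same values as the bytes
at offset `offsetof(S, j)` in `s`. It says nothing about whether they
are the same `unsigned char` objects, which is the point in dispute.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy, std::memcmp

struct S {
    int i, j;
};

// Hypothetical helper: copies out the bytes of the whole object and of
// its member j, then compares the member's bytes against the matching
// slice of the whole object's bytes.
bool member_bytes_appear_in_enclosing(const S& s) {
    unsigned char whole[sizeof(S)];
    unsigned char member[sizeof(int)];

    std::memcpy(whole, &s, sizeof(S));        // bytes of the complete object
    std::memcpy(member, &s.j, sizeof(int));   // bytes of the member subobject

    // The member's bytes should match the slice starting at offsetof(S, j).
    return std::memcmp(whole + offsetof(S, j), member, sizeof(int)) == 0;
}
```

Copying through `std::memcpy` sidesteps the question of which array a
`reinterpret_cast` result points into, so this check is usable
regardless of how that question is answered.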

It's also important to recognize that there must be a way to have
contiguous objects nested within non-contiguous ones, and that it's
not reasonable to be able to go from one contiguous object to a
separate contiguous object through a non-contiguous object. I think
that preventing this is one of the reasons why the text reads as it
does.

That being said, it's also important to recognize that the "sequence
of unsigned chars" is not an array. There are no arrays within that
sequence at all. There are some sequences of these chars which are
*considered* arrays, but they're not actually arrays.

Because we're not dealing with a real array object, only with what the
bytes are "considered" to be, it wouldn't be too difficult to make this
work. We would need to add a statement saying that, if an object is of
a contiguous type, then the "sequences of unsigned chars" for its
object representation and for those of the objects nested within it
are all considered part of the same array. And there would need to be
a corresponding statement saying that, when you get a pointer to the
object representation of an object from a `reinterpret_cast` and that
object is nested within a contiguous object, the pointer you get back
points into the "considered" array of the largest containing
contiguous object.
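
For reference, the pattern at stake looks roughly like the sketch
below, written against the struct S from earlier in the thread rather
than the QObject case. The name `enclosing_from_j` is hypothetical.
Mainstream compilers accept and run this today, but whether the
subtraction is well-defined is exactly what is being argued here: it
is fine only if the cast yields a pointer to element `offsetof(S, j)`
of the array considered for S, not to element 0 of a separate array
for `j`.

```cpp
#include <cstddef>   // offsetof

struct S {
    int i, j;
};

// Hypothetical helper: recover the enclosing S from a pointer to its
// member j by stepping back offsetof(S, j) bytes.
// ASSUMPTION: the cast result points into the "considered" array of the
// enclosing S. Under a reading where j has its own array, the
// subtraction below would run afoul of [expr.add]/4.
S* enclosing_from_j(int* jptr) {
    auto* bytes = reinterpret_cast<unsigned char*>(jptr);
    return reinterpret_cast<S*>(bytes - offsetof(S, j));
}
```

With the two statements suggested above, `bytes` would point into the
array for the largest containing contiguous object, so the negative
offset would stay within bounds.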

Received on 2020-07-21 21:35:33