ISOCPP Std Discussion List: Re: UB in P2641 'Checking if a union alternative is active'

From: Matthew House <mattlloydhouse_at_[hidden]>
Date: Tue, 20 Jun 2023 11:58:33 -0400

On Tue, Jun 20, 2023 at 2:27 AM Jens Maurer via Std-Discussion
<std-discussion_at_[hidden]> wrote:
> On 20/06/2023 04.11, Matthew House via Std-Discussion wrote:
> > On Mon, Jun 19, 2023 at 8:46 PM Brian Bi via Std-Discussion
> > <std-discussion_at_[hidden]> wrote:
> >>> More generally, such an interpretation would completely break
> >>> mechanisms enabled by [basic.lval]/11 using pointers that happen to
> >>> refer to union members, since inactive union members would always have
> >>> preference over reinterpretations allowed by the rule. For instance,
> >>> suppose that u.c were declared as an unsigned char instead of a char.
> >>> Then, std::memcpy(dest, &u.i, sizeof(int)) would be UB, since by
> >>> reinterpreting its argument as an array of unsigned char, memcpy would
> >>> produce a pointer to u.c, then read past its end. I don't think that's
> >>> something that can be considered reasonable.
> >>
> >> Well, `std::memcpy` can be defined by magic to do the right thing, but I guess you're talking about a user-written analogue. Still, I don't understand your argument. Under the current wording, you can't write such a thing and have it have well-defined behavior according to the letter of the law, regardless of what view you take on whether the `u.c` object exists when it's not active.
> >
> > I'll admit, I don't understand the argument that P1839 seems to hinge
> > on, to argue that even reading the first byte of the object
> > representation is UB:
> >
> >> When a is dereferenced, the behaviour is undefined as per [expr.pre]
> >> p4 because the value of the resulting expression would not be the
> >> value of the first byte, but the value of the whole int object
> >> (123456), which is not a value representable by unsigned char.
>
> In recent years, we've come to understand better that "the object the pointer
> points to" may be different from "the pointee of the type of the pointer".
>
> For example, when casting a pointer to T to a pointer to void, the pointer
> still points to a T object, although the type of the expression doesn't
> say so. Or, by chaining two static_casts, you can actually have a pointer
> of type "pointer to char" have a value that actually points to an object of
> type int.

Of course. I'm not trying to dispute the general idea that the pointer
type can be distinct from the dynamic type of the object. Instead, I'm
disputing that scalar accesses (lvalue-to-rvalue conversions and
assignments) in particular are and ought to be beholden to the dynamic
type of the object, since otherwise [basic.lval]/11 is a nearly
meaningless clause. Indeed, given the subject, I'm surprised that the
paper doesn't mention the clause at all, even to explain how it
doesn't help.
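
For concreteness, here is a minimal sketch of the access in dispute.
The function names are mine, not the paper's; 123456 is the value from
the paper's example quoted above. Under the reading I'm arguing for,
both functions yield the first byte of the object representation.
Under the paper's reading, the first one is UB as currently worded.

```cpp
#include <cstring>

// Reads the first byte of an int through an unsigned char glvalue.
// This is the scalar access that P1839 argues is UB under the current
// wording, because the int's value (e.g. 123456) does not fit in
// unsigned char.
inline unsigned char first_byte_via_glvalue(const int& obj) {
    const unsigned char* a = reinterpret_cast<const unsigned char*>(&obj);
    return *a;
}

// Reads the same byte by copying the object representation into a real
// array of unsigned char first.
inline unsigned char first_byte_via_memcpy(const int& obj) {
    unsigned char bytes[sizeof(int)];
    std::memcpy(bytes, &obj, sizeof(int));
    return bytes[0];
}
```

I would expect the two to agree for every int on the implementations
I know of, which is the behavior I think [basic.lval]/11 was meant to
license.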

(And the paper's proposal would not fully cure the meaninglessness
that it reads into [basic.lval]/11: by its logic, reinterpreting a
negative integer as an unsigned integer via type punning is instant
UB, whereas under the analogous wording in C, the relevant details of
the object representation are merely implementation-defined.)
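
A sketch of the signed/unsigned case, with hypothetical function names
of my own. It assumes int and unsigned have the same size and no
padding bits, which I believe holds on mainstream implementations:

```cpp
#include <cstring>

// Reinterprets a (possibly negative) int by reading it through an
// unsigned glvalue, which the second bullet of [basic.lval]/11 permits.
// By the paper's logic this would be UB for negative n, since the
// int's value is not representable as unsigned.
inline unsigned pun_via_glvalue(const int& n) {
    return *reinterpret_cast<const unsigned*>(&n);
}

// The same reinterpretation, expressed as a copy of the object
// representation.
inline unsigned pun_via_memcpy(const int& n) {
    static_assert(sizeof(unsigned) == sizeof(int),
                  "corresponding signed/unsigned types share a size");
    unsigned u;
    std::memcpy(&u, &n, sizeof u);
    return u;
}

// Ordinary value conversion for comparison: well-defined, and reduces
// the value modulo 2^N.
inline unsigned convert(int n) {
    return static_cast<unsigned>(n);
}
```

With two's complement and the assumptions above, all three should
give the same result for n == -1. Only the first depends on how
[conv.lval]/3 is read.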

> > This interpretation appears to defeat the entire purpose of
> > the first sentence in [basic.lval]/11, which I will repeat here for
> > reference:
> >
> >> If a program attempts to access (3.1) the stored value of an object
> >> through a glvalue whose type is not similar (7.3.6) to one of the
> >> following types the behavior is undefined:
> >> - the dynamic type of the object,
> >> - a type that is the signed or unsigned type corresponding to the
> >> dynamic type of the object, or
> >> - a char, unsigned char, or std::byte type.
>
> > I have always imagined this as implying a series of steps for
> > performing a read where the type of the glvalue is not similar to the
> > dynamic type of the object:
> > 1. Locate the object referred to by the glvalue.
> > 2. Select the appropriate bytes in the object representation.
>
> That's exactly the problem: There is no talk about "object representation"
> in the existing text here.
>
> > 3. Interpret those bytes as a value of the glvalue's type.
> > 4. Return the resulting value.
> > (The reverse process would occur for a modification.)
> >
> > Indeed, [basic.lval]/11 originates from an analogous clause in
> > standard C, which at another point explicitly clarifies the supremacy
> > of the lvalue's type: "The meaning of a value stored in an object or
> > returned by a function is determined by the *type* of the expression
> > used to access it."
>
> We can't have this in C++, because you (always) could have a
> pointer-to-base-class refer to an object that is actually of a derived
> class type.

Sure, obviously the full principle wouldn't make sense given C++'s
features. But I'm talking about scalar accesses and scalar accesses
alone, for which the reinterpretations in [basic.lval]/11 are mostly
harmless to the broader type system and used every day by C programs.
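
To spell out the four-step read I described earlier, here is a rough
model of it restricted to trivially copyable types. read_as is a
hypothetical helper of mine, not anything in the standard or in P1839.
Note that the byte-pointer arithmetic it uses is exactly the kind whose
status the paper questions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical model of a scalar read through a glvalue of type To that
// refers to an object of type From, starting at byte `offset`.
template <class To, class From>
To read_as(const From& obj, std::size_t offset = 0) {
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "the model only covers trivially copyable types");
    // Step 1: locate the object referred to by the glvalue (obj).
    // Step 2: select the appropriate bytes of its object representation.
    assert(offset <= sizeof(From) && sizeof(To) <= sizeof(From) - offset);
    unsigned char bytes[sizeof(To)];
    std::memcpy(bytes,
                reinterpret_cast<const unsigned char*>(&obj) + offset,
                sizeof(To));
    // Step 3: interpret those bytes as a value of the glvalue's type.
    To result;
    std::memcpy(&result, bytes, sizeof(To));
    // Step 4: return the resulting value.
    return result;
}
```

For example, read_as<unsigned char>(i) models *reinterpret_cast<unsigned
char*>(&i), and read_as<unsigned>(i) models the signed/unsigned case.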

> > But C++ isn't so clear about the result when an object is
> > reinterpreted as another type using [basic.lval]/11. Apart from
> > [basic.lval]/11 itself, the most relevant language I could find is in
> > [conv.lval]/3:
> >
> >> The result of the conversion is determined according to the
> >> following rules:
> >> [...]
> >> - Otherwise, the object indicated by the glvalue is read (3.1), and
> >> the value contained in the object is the prvalue result.
> >
> > "The object indicated by the glvalue" is surely the object that the
> > glvalue refers to, but "the value contained in the object" is somewhat
> > ambiguous, especially since the clause references no mechanism for
> > converting to the glvalue's type. Is "the value contained" exactly the
> > value of the object in its dynamic type? Or is "the value contained"
> > the value resulting from interpreting the object representation as a
> > value of the glvalue's type? P1839 briefly assumes the former, but I
> > don't see how that interpretation can be squared with the purpose of
> > [basic.lval]/11.
>
> See, it's not so easy.

My problem is that the paper simply assumes the answer and goes from
there, unless you can point me to another source where the former
interpretation has been clarified. Indeed, that interpretation further
requires saying that the values of the integer types are members of
the full mathematical set of integers, not just members of disjoint
width-specific sets (denoted by value representations) that are
considered to represent mathematical integers. (Otherwise, the prvalue
result could never be *the* value contained in the object, if the
prvalue's type has a different width than the object's type.) As far
as I can see, the standard never says anything that strong.

> > (If we were to make the second interpretation explicit in
> > [conv.lval]/3 and [expr.ass]/2, it would obviate the first problem
> > brought up in the paper. Yet the problem of allowing pointer
> > arithmetic with an unsigned char* pointer to a general object would
> > remain. But the paper's proposal seems quite ugly to me; in my view,
> > this would be more cleanly solved by introducing a new kind of
> > pointer, ...
>
> The goal is to make code that "should work" (because it has worked
> in C and C++ forever) just work by putting a suitable model underneath
> it, not to introduce new kinds of pointers (which would not help
> existing code).

Why exactly wouldn't such a solution help existing code? Are there any
particular operations that require the existence of a special array
object, rather than just the ability to access part of an object
representation using a byte type? (My whole thought here is to avoid
all the issues in the paper where you thought you got a pointer to an
object, when actually you just got a pointer to the object
representation.)
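
The kind of existing code I have in mind is a user-written memcpy
analogue like the sketch below. copy_bytes is a hypothetical name, and
the union is my reconstruction of the one discussed earlier in the
thread, with c declared as unsigned char. As far as I can tell, all it
needs is byte-wise access to the object representation, not an actual
array object.

```cpp
#include <cstddef>

// User-written analogue of std::memcpy: walks both objects one byte at
// a time through unsigned char pointers.
inline void* copy_bytes(void* dest, const void* src, std::size_t count) {
    unsigned char* d = static_cast<unsigned char*>(dest);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t k = 0; k < count; ++k) {
        // For k > 0 this needs pointer arithmetic past the first byte,
        // which is the part the current wording does not clearly allow.
        d[k] = s[k];
    }
    return dest;
}

// Reconstruction of the union from the thread (member names assumed).
union U {
    int i;
    unsigned char c;
};
```

With u.i active, copy_bytes(&out, &u.i, sizeof(int)) is the call that
should just work, whether or not u.c is considered to exist.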

Received on 2023-06-20 15:58:45