sg12: Re: [ub] Lvalue-to-rvalue conversion and reinterpret

From: Richard Smith <richardsmith_at_[hidden]>
Date: Wed, 26 Jun 2019 16:15:35 -0700

On Wed, Jun 26, 2019 at 2:22 PM Language Lawyer <language.lawyer_at_[hidden]>
wrote:

> Excuse my annoyingness, I know I shouldn't have sent this...
>
> Let's look at the code below:
>
> int i = -1;
> unsigned u = reinterpret_cast<unsigned&>(i);
>
> To my reading of the standard, it has undefined behavior.
> `reinterpret_cast<unsigned&>(i)` is an lvalue of `unsigned int` type
> denoting the `i` object of type `int` which holds the value `-1`.
> Lvalue-to-rvalue conversion applied to this lvalue in the initialization
> produces a prvalue of type `unsigned int`.
> The result of this prvalue is determined according to [conv.lval]/3, of
> which bullet 3.4 applies:
>
> (3.4) — Otherwise, the value contained in the object indicated by the
> glvalue is the prvalue result.
>
> So, we have an `unsigned int` prvalue with `-1` as its result.
> According to [basic.fundamental]/2, the range representable by `unsigned
> int` is from 0 to 2^N - 1 (inclusive) for some N ≥ 16.
> [expr.pre]/4 suggests that we have UB here:
>
> [expr.pre]/4 If during the evaluation of an expression, the result is not
> mathematically defined or not in the range of representable values for its
> type, the behavior is undefined.
>
> Given that on most implementations (before and, especially, after P1236R1
> was merged into the WP) one would get `u` initialized to `UINT_MAX` as the
> "result" of the UB, I propose to legalize this behavior.
>
> *Proposed wording*
>
> Modify [conv.lval]/3 as follows (changes are relative to N4820):
>
> The result of the conversion is determined according to the following
> rules:
> — If T is cv std::nullptr_t, the result is a null pointer constant
> ([conv.ptr]). [ Note: Since the conversion does not access the object to
> which the glvalue refers, there is no side effect even if T is
> volatile-qualified ([intro.execution]), and the glvalue can refer to an
> inactive member of a union ([class.union]). — end note]
> — Otherwise, if T has a class type, the conversion copy-initializes the
> result object from the glvalue.
> — Otherwise, if the object to which the glvalue refers contains an
> invalid pointer value ([basic.stc.dynamic.deallocation],
> [basic.stc.dynamic.safety]), the behavior is implementation-defined.
> <ins> — Otherwise, if T is a signed integer type and the object to which
> the glvalue refers has corresponding unsigned integer type or vice versa,
> the result is the unique value from the range of representable values for
> the type T that is congruent to the stored value of the object modulo 2^N,
> where N is the width of the type T ([basic.fundamental]).</ins>
> — Otherwise, the value contained in the object indicated by the glvalue
> is the prvalue result.
>
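
To make the proposed bullet concrete, here is a minimal sketch (not part of
the proposed wording) of the value it would assign to the read in the example
above, computed by two routes that do not rely on the contested
lvalue-to-rvalue conversion. The helper names `as_unsigned_mod` and
`as_unsigned_bytes` are hypothetical, and the second helper assumes a two's
complement representation with no padding bits.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Hypothetical helper: the unique unsigned value congruent to `i` modulo 2^N,
// where N is the width of unsigned int. Signed-to-unsigned integral
// conversion is already specified to be modular ([conv.integral]).
unsigned as_unsigned_mod(int i) {
    return static_cast<unsigned>(i);
}

// Hypothetical helper: copies the object representation of `i` into an
// unsigned int. Assumption: two's complement and no padding bits, in which
// case this agrees with as_unsigned_mod for every input.
unsigned as_unsigned_bytes(int i) {
    static_assert(sizeof(unsigned) == sizeof(int),
                  "corresponding signed/unsigned types share a size");
    unsigned u;
    std::memcpy(&u, &i, sizeof u);
    return u;
}
```

Under those assumptions both helpers return `UINT_MAX` for `-1`, which is the
value the proposed bullet would give `u` in the original example.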

Seems reasonable to me. I think we're missing another case here too; from
[basic.lval]/11:

"a char, unsigned char, or std::byte type"

for which the lvalue-to-rvalue conversion should produce the value of the
corresponding byte of the object representation, suitably converted. Also,
the above change only covers the lvalue-to-rvalue conversion; we will
presumably also need to cover assignments, for which [expr.ass] says "In
simple assignment (=), the object referred to by the left operand is
modified by replacing its value with the result of the right operand." but
doesn't cover the "signed or unsigned type corresponding to the dynamic
type" case nor the "char, unsigned char, or std::byte" case.
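
For illustration only, the two uncovered cases look roughly like this in code.
This is a sketch of what one would expect on a two's complement
implementation, not a statement of what the current wording guarantees; the
function names `byte_of` and `store_through_unsigned` are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Case 1: lvalue-to-rvalue conversion through an unsigned char lvalue.
// Intended result: byte `n` of the object representation of `i`.
unsigned char byte_of(const int& i, std::size_t n) {
    return reinterpret_cast<const unsigned char*>(&i)[n];
}

// Case 2: simple assignment through an lvalue of the corresponding unsigned
// type. [expr.ass] says the object's value is replaced by the right operand,
// but an int object cannot hold every unsigned value as such.
int store_through_unsigned(unsigned value) {
    int i = 0;
    reinterpret_cast<unsigned&>(i) = value;
    return i;
}
```

On such an implementation one would expect `byte_of(-1, 0)` to be `UCHAR_MAX`
and `store_through_unsigned(UINT_MAX)` to be `-1`, i.e. the stored value is
the one congruent to the assigned value modulo 2^N.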

Received on 2019-06-27 01:15:48