sg12: Re: [ub] bit_cast and indeterminate values

From: Richard Smith <richardsmith_at_[hidden]>
Date: Thu, 20 Jun 2019 18:50:42 -0700

On Thu, Jun 20, 2019 at 2:22 PM Richard Smith <richardsmith_at_[hidden]>
wrote:

> As currently specified, bit_cast from an indeterminate value produces an
> unspecified value rather than an indeterminate value. That means this can't
> be implemented by a simple load on some implementations, and instead will
> require some kind of removing-the-taint-of-an-uninitialized-value operation
> to be performed. (A similar concern applies to reading from padding bits.)
>
> Is that the intent?
>

I chatted with JF about this. The intent is as follows:

* bits of the input that don't have defined values result in the
corresponding bit of the output being "bad"
* if any part of a scalar object is "bad", that object has an
indeterminate value

Some examples:

struct A { char c; /* char padding : 8; */ short s; };
struct B { char x[4]; };

B one() {
  A a = {1, 2};
  return std::bit_cast<B>(a);
}

In one(), the second byte of the object representation of a is bad. That
means that the second byte of the produced B object is bad, so x[1] in the
produced B object has an indeterminate value. The above function, if
declared constexpr, would be usable in constant expressions so long as you
don't look at one().x[1].

A two() {
  B b;
  b.x[0] = 'a';
  b.x[2] = 1;
  b.x[3] = 2;
  return std::bit_cast<A>(b);
}

In two(), the second byte of the object representation of b is bad. But a
bit_cast to A doesn't care because it never looks at that byte. The above
function returns an A with a fully-defined value. If declared constexpr, it
would produce a normal, fully-initialized value.

int three() {
  int n;
  return std::bit_cast<int>(n);
}

In three(), the entirety of n is bad. A bit_cast from it produces an int
whose value is indeterminate. And because we have an expression of
non-byte-like type that produced an indeterminate value, the behavior is
undefined.

B four() {
  int n;
  return std::bit_cast<B>(n);
}

In four(), just like three(), the entirety of n is bad, so the scalar
subobjects of B are bad too. But because they're of byte-like type, that's
OK: we can copy them about and produce them from prvalue expressions.
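
The four cases above can be replayed with a small model. This is only a
sketch of the two rules, not an implementation of bit_cast: ByteFlags,
Scalar, Outcome, classify and check_examples are hypothetical names, the
model tracks whole bytes rather than individual bits, and it assumes
sizeof(A) == sizeof(B) == sizeof(int) == 4.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one flag per byte of the source object representation.
// true = the byte has a defined value, false = the byte is "bad".
using ByteFlags = std::vector<bool>;

// One scalar subobject of the destination type, described by its byte range.
struct Scalar {
    std::size_t offset;
    std::size_t size;
    bool byte_like;  // e.g. unsigned char or std::byte
};

enum class Outcome { Defined, Indeterminate, Undefined };

// A scalar that overlaps any bad byte is indeterminate. That is tolerated
// only for byte-like types; otherwise the behavior is undefined.
Outcome classify(const ByteFlags& src, const Scalar& dst) {
    for (std::size_t i = 0; i < dst.size; ++i) {
        if (!src[dst.offset + i]) {
            return dst.byte_like ? Outcome::Indeterminate : Outcome::Undefined;
        }
    }
    return Outcome::Defined;
}

// Replays one() through four() under the size assumptions stated above.
void check_examples() {
    const ByteFlags padded{true, false, true, true};   // a in one(), b in two()
    const ByteFlags none{false, false, false, false};  // n in three() and four()

    assert(classify(padded, {1, 1, true}) == Outcome::Indeterminate);  // one(): x[1]
    assert(classify(padded, {0, 1, true}) == Outcome::Defined);        // two(): c
    assert(classify(padded, {2, 2, false}) == Outcome::Defined);       // two(): s
    assert(classify(none, {0, 4, false}) == Outcome::Undefined);       // three(): the int
    assert(classify(none, {0, 1, true}) == Outcome::Indeterminate);    // four(): x[0]
}
```

The point of the model is that the same source flags give different
outcomes depending only on which destination scalars overlap the bad bytes
and whether those scalars are byte-like.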

I think the above is captured by the following wording change:

Change in [bit.cast]p1:

"""
Returns: An object of type To. Each bit of the value representation of the
result is equal to the
corresponding bit in the object representation of from. Padding bits of the
To object are unspecified.
If there is no value of type To corresponding to the value representation
produced, the behavior is
undefined. If there are multiple such values, which value is produced is
unspecified.
<ins>A bit in the value representation of the result is indeterminate if
it does not correspond to a bit in the value
representation of from or corresponds to a bit of an object that is not
within its lifetime or has an indeterminate value ([basic.indet]).
For each bit in the value representation of the result that is
indeterminate,
the smallest object containing that bit has an indeterminate value;
the behavior is undefined unless that object is of unsigned ordinary
character type or std::byte type.
The result does not otherwise contain any indeterminate values.</ins>
"""

Received on 2019-06-21 03:50:55