
SG12


Subject: Re: [ub] [isocpp-lib] bit_cast and indeterminate values
From: Gabriel Dos Reis (gdr_at_[hidden])
Date: 2019-06-21 21:54:09




On Jun 21, 2019, at 6:48 PM, Richard Smith via Lib <lib_at_[hidden]> wrote:

On Fri, Jun 21, 2019 at 5:46 PM Gabriel Dos Reis <gdr_at_[hidden]> wrote:
On Jun 20, 2019, at 6:51 PM, Richard Smith <richardsmith_at_[hidden]> wrote:

On Thu, Jun 20, 2019 at 2:22 PM Richard Smith <richardsmith_at_[hidden]> wrote:
As currently specified, bit_cast from an indeterminate value produces an unspecified value rather than an indeterminate value. That means this can't be implemented by a simple load on some implementations, and instead will require some kind of removing-the-taint-of-an-uninitialized-value operation to be performed. (A similar concern applies to reading from padding bits.)

Is that the intent?

I chatted with JF about this. The intent is as follows:

 * bits of the input that don't have defined values result in the corresponding bit of the output being "bad"
 * if any part of a scalar object is "bad", that object has an indeterminate value

Some examples:


struct A { char c; /* char padding : 8; */ short s; };
struct B { char x[4]; };

B one() {
  A a = {1, 2};
  return std::bit_cast<B>(a);
}

In one(), the second byte of the object representation of a is bad. That means that the second byte of the produced B object is bad, so x[1] in the produced B object is an indeterminate value. The above function, if declared constexpr, would be usable in constant expressions so long as you don't look at one().x[1].
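
To make the "second byte is bad" claim concrete, here is a minimal sketch of the layout that example relies on. The helper names (is_value_byte, check_layout) are hypothetical, and the static_asserts encode ABI assumptions (2-byte short, 2-byte alignment) that hold on common platforms but are not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

struct A { char c; /* one padding byte on typical ABIs */ short s; };

// ABI assumptions for this sketch; the build fails if the target differs.
static_assert(sizeof(short) == 2, "sketch assumes a 2-byte short");
static_assert(sizeof(A) == 4, "sketch assumes A occupies 4 bytes");
static_assert(offsetof(A, s) == 2, "sketch assumes s starts at byte 2");

// Hypothetical helper: does byte i of A's object representation
// belong to a member (c or s) rather than to padding?
constexpr bool is_value_byte(std::size_t i) {
  return i < sizeof(char) ||
         (i >= offsetof(A, s) && i < offsetof(A, s) + sizeof(short));
}

// Which bytes of the B produced by one() correspond to defined input?
inline void check_layout() {
  assert(is_value_byte(0));                      // c    -> x[0]
  assert(!is_value_byte(1));                     // pad  -> x[1] is "bad"
  assert(is_value_byte(2) && is_value_byte(3));  // s    -> x[2], x[3]
}
```

Under these assumptions only x[1] of the result maps to padding, which matches the statement that the other three elements can be read freely.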


A two() {
  B b;
  b.x[0] = 'a';
  b.x[2] = 1;
  b.x[3] = 2;
  return std::bit_cast<A>(b);
}

In two(), the second byte of the object representation of b is bad. But a bit_cast to A doesn't care because it never looks at that byte. The above function returns an A with a fully-defined value. If declared constexpr, it would produce a normal, fully-initialized value.
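
A runnable approximation of two(), as a sketch only: it uses a std::memcpy-based stand-in (the hypothetical copy_bits) instead of std::bit_cast so that it also builds before C++20, and it assumes the same layout as above. The helper expected_s is likewise hypothetical and exists only to avoid hard-coding the byte order.

```cpp
#include <cassert>
#include <cstddef>  // offsetof
#include <cstring>  // std::memcpy

struct A { char c; short s; };
struct B { char x[4]; };

static_assert(sizeof(A) == sizeof(B), "the cast requires equal sizes");
static_assert(sizeof(short) == 2 && offsetof(A, s) == 2,
              "sketch assumes s occupies bytes 2 and 3 of A");

// Hypothetical stand-in for std::bit_cast<A>(b): copies the object
// representation byte for byte, which is what a "simple load" amounts to.
inline A copy_bits(const B& b) {
  A a;
  std::memcpy(&a, &b, sizeof(A));
  return a;
}

inline A two() {
  B b;           // x[1] is deliberately left uninitialized
  b.x[0] = 'a';
  b.x[2] = 1;
  b.x[3] = 2;
  return copy_bits(b);  // x[1] lands in A's padding byte
}

// The short denoted by the bytes {1, 2} on this platform
// (the numeric value depends on endianness).
inline short expected_s() {
  const char bytes[2] = {1, 2};
  short s;
  std::memcpy(&s, bytes, sizeof s);
  return s;
}

inline void check_two() {
  const A a = two();
  assert(a.c == 'a');
  assert(a.s == expected_s());
}
```

Both members of the result are determined entirely by the three initialized bytes; the uninitialized byte is copied but never interpreted as part of a value.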


int three() {
  int n;
  return std::bit_cast<int>(n);
}

In three(), the entirety of n is bad. A bit_cast from it produces an int whose value is indeterminate. And because we have an expression of non-byte-like type that produced an indeterminate value, the behavior is undefined.

Hmmm, isn’t it the lvalue-to-rvalue conversion that provokes the UB? I am asking a question regarding the actual cause of the UB.

This doesn't naturally fall out of the wording, because std::bit_cast takes an lvalue and returns an rvalue, and is specified in terms of poking at an object representation directly without formally performing an lvalue-to-rvalue conversion or hitting the [basic.indet] cases where the undefined value is "observed".

And I think that's actually in some ways a good thing: it shouldn't matter if some part of the input to bit_cast is uninitialized so long as the destination type has only padding in the uninitialized region.

Yeah, but beyond char-like types, I think we do hit that case frequently, e.g. in three() above, or if producing A() out of an int. It would actually help to pinpoint the cause of the undefined behavior, especially for tools (including the compiler) that help with sanitization - more portably.


B four() {
  int n;
  return std::bit_cast<B>(n);
}

In four(), just like three(), the entirety of n is bad, so the scalar subobjects of B are bad too. But because they're of byte-like type, that's OK: we can copy them about and produce them from prvalue expressions.


I think the above is captured by the following wording change:

Change in [bit.cast]p1:

"""
Returns: An object of type To. Each bit of the value representation of the result is equal to the
corresponding bit in the object representation of from. Padding bits of the To object are unspecified.
If there is no value of type To corresponding to the value representation produced, the behavior is
undefined. If there are multiple such values, which value is produced is unspecified.
<ins>A bit in the value representation of the result is indeterminate if it does not correspond to a bit in the value
representation of from or corresponds to a bit of an object that is not within its lifetime or has an indeterminate value ([basic.indet]).
For each bit in the value representation of the result that is indeterminate,
the smallest object containing that bit has an indeterminate value;
the behavior is undefined unless that object is of unsigned ordinary character type or std::byte type.
The result does not otherwise contain any indeterminate values.</ins>
"""
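
One way to read the proposed rule is as a small classification procedure. The sketch below is a hypothetical model, not part of any library: it works at byte rather than bit granularity, the caller describes the destination's scalar subobjects by hand, a 4-byte int and the A layout from the examples are assumed, and plain char is treated as byte-like, as the examples above do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one scalar subobject of the destination type.
struct Scalar {
  std::size_t offset;  // first byte within the object representation
  std::size_t size;    // number of bytes
  bool byte_like;      // may this type hold an indeterminate value?
};

enum class Outcome { FullyDefined, IndeterminateBytes, Undefined };

// src_defined[i] says whether byte i of the source holds a defined value
// (false for padding and for uninitialized bytes).
inline Outcome classify(const std::vector<bool>& src_defined,
                        const std::vector<Scalar>& dst) {
  Outcome result = Outcome::FullyDefined;
  for (const Scalar& s : dst) {
    bool bad = false;
    for (std::size_t i = s.offset; i < s.offset + s.size; ++i)
      if (!src_defined[i]) bad = true;
    if (!bad) continue;                              // scalar is fully defined
    if (!s.byte_like) return Outcome::Undefined;     // e.g. int, short
    result = Outcome::IndeterminateBytes;            // allowed for byte-like
  }
  return result;  // bytes landing in destination padding are never inspected
}

// The four examples from this thread, encoded by hand.
inline void thread_examples() {
  const std::vector<Scalar> as_A = {{0, 1, true}, {2, 2, false}};
  const std::vector<Scalar> as_B = {
      {0, 1, true}, {1, 1, true}, {2, 1, true}, {3, 1, true}};
  const std::vector<Scalar> as_int = {{0, 4, false}};
  const std::vector<bool> byte1_bad = {true, false, true, true};
  const std::vector<bool> all_bad(4, false);

  assert(classify(byte1_bad, as_B) == Outcome::IndeterminateBytes);  // one()
  assert(classify(byte1_bad, as_A) == Outcome::FullyDefined);        // two()
  assert(classify(all_bad, as_int) == Outcome::Undefined);           // three()
  assert(classify(all_bad, as_B) == Outcome::IndeterminateBytes);    // four()
}
```

The model returns as soon as a non-byte-like scalar contains a bad byte, mirroring "the behavior is undefined unless that object is of unsigned ordinary character type or std::byte type".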
_______________________________________________
ub mailing list
ub_at_[hidden]<mailto:ub_at_[hidden]>
http://www.open-std.org/mailman/listinfo/ub
_______________________________________________
Lib mailing list
Lib_at_[hidden]<mailto:Lib_at_[hidden]>
Subscription:
http://lists.isocpp.org/mailman/listinfo.cgi/lib
Link to this post: http://lists.isocpp.org/lib/2019/06/12307.php



SG12 list run by herb.sutter at gmail.com