Re: [wg14/wg21 liaison] C trap representations and unspecified values versus C++ indeterminate values

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Sat, 10 Oct 2020 00:32:52 -0400
On Fri, Oct 9, 2020 at 7:14 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:

>
> Let me add a small bit of context here.
>
> On 10/10/2020 00.04, Hubert Tong via Liaison wrote:
> > Older thread appears to end here:
> > http://open-std.org/JTC1/SC22/WG14/17199, "(SC22WG14.17199)
> > terminology: indeterminate value"
> >
> > UB from uninitialized values emanates from indeterminate values in C++
> > and trap representations in C.
> > C further has unspecified values that become unspecified at a specific
> > point in time but can be copied without mutation of the value. C++ received
> > a National Body comment for C++14 where such cases were removed:
> > https://wg21.link/cwg1787.
>
> So, you're saying that the issue CWG 1787 highlighted for C++ is
> equally an issue in C, and should be addressed in a similar fashion.
>
Yes, the specific example suffers from the combination of two facts:
unsigned char has no trap representations, and an unspecified value, once
"discovered", is a specific value of the type.

In C, the bytes of a union's object representation that are not also bytes
of the object representation of the active member likewise become
unspecified values, which erases any trap representations (because
unspecified values in C are not trap representations).
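
To make the union case concrete, here is a minimal sketch in C. The type
`union sample` and both function names are hypothetical, chosen only for
illustration, and the sketch assumes an implementation that provides
uint32_t. It stores through the larger member, then through the smaller
one, and reads back only the byte that still belongs to the member stored
last. The remaining bytes are the ones that take unspecified values in C,
so the code deliberately does not inspect them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical union for illustration: one small and one large member. */
union sample {
    unsigned char small;
    uint32_t      large;
};

/* Number of bytes of the union's object representation that are not
   part of the object representation of `small`. */
size_t bytes_outside_small(void)
{
    assert(sizeof(union sample) >= sizeof(uint32_t));
    return sizeof(union sample) - sizeof(unsigned char);
}

/* Store through `large`, then through `small`, and return the first byte
   of the union. Only this byte corresponds to `small`; after the second
   store, the other bytes take unspecified values in C, so they are not
   read here. */
unsigned char first_byte_after_small_store(void)
{
    union sample u;
    unsigned char first;

    u.large = 0xFFFFFFFFu;   /* every byte of `large` now has a known value */
    u.small = 0x2A;          /* bytes outside `small` become unspecified */
    memcpy(&first, &u, 1);   /* all union members start at offset 0 */
    return first;
}
```

Calling first_byte_after_small_store() yields 0x2A. What a read of the
other bytes through unsigned char would produce is exactly the question
this thread is about.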

It appears C++ is less aggressive with respect to initialized padding bytes
in structure and union types.


>
> > It is further an issue that the definition of trap representation does
> > not admit trap representations for the exact-width integer types (int16_t,
> > etc.) because the lack of padding bits, combined with two's complement
> > representation, means that every possible object representation represents
> > a value of the type.
>
> So, C says that uninitialized automatic variables have indeterminate value.
> That's fine for some types, because the implementation can choose to use
> a trap representation as the particular value used for that case,
> which makes any access undefined behavior.
> However, this avenue can't be taken for e.g. int16_t, because there are
> no bits left for a trap representation to be a possibility.
>
Yes, for the C++-compatible range of int16_t, there are no bits left for a
trap representation to be a possibility.
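
As a rough check of that claim, the following C sketch walks every 16-bit
object representation and counts how many distinct int16_t values result.
The function name is hypothetical, and the sketch assumes an
implementation that provides int16_t and uint16_t. If all 65536 patterns
map to distinct values, no pattern is left over to serve as a trap
representation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the distinct int16_t values obtained by copying every possible
   16-bit pattern into an int16_t object. */
long count_distinct_int16_values(void)
{
    static unsigned char seen[65536];
    long distinct = 0;

    /* Exact-width types have no padding bits, so the object is 16 bits. */
    assert(sizeof(int16_t) * CHAR_BIT == 16);
    memset(seen, 0, sizeof seen);

    for (unsigned long bits = 0; bits <= 0xFFFFul; ++bits) {
        uint16_t pattern = (uint16_t)bits;
        int16_t value;

        /* Copy the object representation; no conversion takes place. */
        memcpy(&value, &pattern, sizeof value);

        /* Shift the range [INT16_MIN, INT16_MAX] to [0, 65535]. */
        size_t index = (size_t)((long)value - (long)INT16_MIN);
        if (!seen[index]) {
            seen[index] = 1;
            ++distinct;
        }
    }
    return distinct;
}
```

The count comes out as 65536, one value per pattern, which is why an
implementation cannot reserve any int16_t representation to mean
"uninitialized".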


> But it's really helpful for optimizers to assume that uninitialized
> automatic variables can't be read.
>
> > I think it would really help if the C committee could record a decision
> > to move towards harmonizing with C++ on this.
>
> Yes. In particular since optimizers are likely to be the same for C and
> C++, anyway.
>
Yes, and we're at risk of getting a hodgepodge of the C and C++ rules
in implementations.


>
> Jens
>

Received on 2020-10-09 23:33:11