Date: Sat, 10 Oct 2020 07:52:28 +0000
On Saturday, 10.10.2020, at 00:32 -0400, Hubert Tong via Liaison wrote:
> On Fri, Oct 9, 2020 at 7:14 PM Jens Maurer <Jens.Maurer_at_gmx.net> wrote:
>
> >
> > Let me add a small bit of context here.
> >
> > On 10/10/2020 00.04, Hubert Tong via Liaison wrote:
> > > Older thread appears to end here:
> > > http://open-std.org/JTC1/SC22/WG14/17199, "(SC22WG14.17199)
> >
> > terminology: indeterminate value"
> > >
> > > UB from uninitialized values emanates from indeterminate values in C++
> >
> > and trap representations in C.
There is also a special provision for automatic variables whose
address was never taken.
> > > C further has unspecified values that become unspecified at a specific
> >
> > point in time but can be copied without mutation of the value. C++ received
> > a National Body comment for C++14 where such cases were removed:
> > https://wg21.link/cwg1787.
> >
> > So, you're saying that the issue CWG 1787 highlighted for C++ is
> > equally an issue in C, and should be addressed in a similar fashion.
> >
>
> Yes, the specific example suffers from the combination of there not being
> trap representations for unsigned char and that unspecified values, once
> "discovered" are a specific value of the type.
>
> The bytes of the object representation for a union that are not also bytes
> of the object representation of the active member also become unspecified
> values in C, erasing any trap representations (because unspecified values
> in C are not trap representations).
I am not sure what this means. Representation bytes can always
be read using an lvalue of character type, but if they are
unspecified, they could then be part of a trap representation for
some union member.
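
To make the union case concrete, here is a minimal sketch. The names
`union u` and `dump_union_bytes` are hypothetical and not from the
thread. After a store to the narrow member, the bytes that belong only
to the wider member take unspecified values in C, yet all of them can
still be read through an lvalue of type unsigned char.

```c
#include <stddef.h>

/* Hypothetical union: a 1-byte member overlapping a wider one. */
union u {
    unsigned char c;
    unsigned int  i;
};

/* Copies the object representation of *p into out[], which must hold at
   least sizeof(union u) bytes. Returns the number of bytes copied.
   Reading through unsigned char is always permitted. */
size_t dump_union_bytes(const union u *p, unsigned char *out)
{
    const unsigned char *bytes = (const unsigned char *)p;
    for (size_t k = 0; k < sizeof *p; ++k)
        out[k] = bytes[k];
    return sizeof *p;
}
```

After `x.c = 1`, only `out[0]` is pinned down by the standard; the
remaining bytes are whatever unspecified values the implementation left
there, and reading `x.i` afterwards is where a trap representation could
come into play on an implementation that has one for unsigned int.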
> It appears C++ is less aggressive with respect to initialized padding bytes
> in structure and union types.
>
>
> >
> > > It is further an issue that the definition of trap representation does
My main issue is that "indeterminate value" has a nonsensical
definition in C. I am not sure what C++ does here. But the
term is questionable in itself, as it is meant to describe
object states that do not represent a value at all.
> > not admit trap representations for the exact-width integer types (int16_t,
> > etc.) because the lack of padding bits, combined with two's complement
> > representation, means that every possible object representation represents
> > a value of the type.
Yes.
With provenance, there is the idea that the same representation could
have different meanings in different contexts. I am not sure whether
this plays a role here.
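
The counting argument for the exact-width types can be sketched as
follows. The function name is hypothetical. It assumes only what the
standard requires of int16_t: width exactly 16, no padding bits, two's
complement.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if the number of int16_t values equals the number of bit
   patterns in its object representation, i.e. there is no pattern left
   over that could serve as a trap representation. */
int int16_every_pattern_is_a_value(void)
{
    long values   = (long)INT16_MAX - (long)INT16_MIN + 1;
    long patterns = 1L << (sizeof(int16_t) * CHAR_BIT);
    return values == patterns;
}
```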
> > So, C says that uninitialized automatic variables have indeterminate value.
> > That's fine for some types, because the implementation can choose to use
> > a trap representation as the particular value used for that case,
> > which makes any access undefined behavior.
> > However, this avenue can't be taken for e.g. int16_t, because there are
> > no bits left for a trap representation to be a possibility.
(if the address is not taken, reading it is UB).
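
A small sketch of that provision, with a hypothetical function `pick`.
The undefined read is left as a comment so that the code itself stays
well defined. The rule meant here is, as far as I can tell, the one in
C11 6.3.2.1p2 for automatic objects that could have been declared
`register`.

```c
#include <stdint.h>

/* Hypothetical example: x is automatic and its address is never taken. */
int16_t pick(int use_default)
{
    int16_t x;  /* uninitialized: indeterminate value */

    /* return x;
       Reading x at this point would be undefined behavior in C, even
       though int16_t cannot hold a trap representation. */

    if (use_default)
        x = 0;
    else
        x = 16;
    return x;   /* fine: x has been assigned on every path */
}
```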
>
> Yes, for the C++-compatible range of int16_t, there are no bits left for a
> trap representation to be a possibility.
>
>
> > But it's really helpful for optimizers to assume that uninitialized
> > automatic variables can't be read.
> >
> > > I think it would really help if the C committee could record a decision
> >
> > to move towards harmonizing with C++ on this.
> >
> > Yes. In particular since optimizers are likely to be the same for C and
> > C++, anyway.
> >
>
> Yes, and we're at risk of getting a hodge podge mix of the C and C++ rules
> in implementations.
Yes, but keep in mind that there are also many C compilers that
do not support C++.
Best,
Martin
Received on 2020-10-10 02:52:36