Re: [wg14/wg21 liaison] C trap representations and unspecified values versus C++ indeterminate values

From: Peter Sewell <Peter.Sewell_at_[hidden]>
Date: Sat, 10 Oct 2020 11:42:00 +0100
On Sat, 10 Oct 2020 at 05:33, Hubert Tong via Liaison <
liaison_at_[hidden]> wrote:

> On Fri, Oct 9, 2020 at 7:14 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>
>>
>> Let me add a small bit of context here.
>>
>> On 10/10/2020 00.04, Hubert Tong via Liaison wrote:
>> > Older thread appears to end here:
>> > http://open-std.org/JTC1/SC22/WG14/17199, "(SC22WG14.17199)
>> > terminology: indeterminate value"
>> >
>> > UB from uninitialized values emanates from indeterminate values in C++
>> > and trap representations in C.
>> > C further has unspecified values that become unspecified at a specific
>> > point in time but can be copied without mutation of the value. C++ received
>> > a National Body comment for C++14 where such cases were removed:
>> > https://wg21.link/cwg1787.
>>
>> So, you're saying that the issue CWG 1787 highlighted for C++ is
>> equally an issue in C, and should be addressed in a similar fashion.
>>
> Yes, the specific example suffers from the combination of two facts:
> unsigned char has no trap representations, and an unspecified value, once
> "discovered", is a specific value of the type.
>

Kayvan Memarian, Victor Gomes, and I have previously thought about this in
some detail; see e.g. this note:
https://www.cl.cam.ac.uk/~pes20/cerberus/notes98-2018-04-21-uninit-v4.html

(which is an update of our 2018-03 WG14 Brno meeting papers:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2220.htm
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2221.htm)

I've not gone through it again now to check that I'd still propose exactly
this, but a lot of the discussion and examples will still be relevant in
any case.

Peter



>
> The bytes of the object representation for a union that are not also bytes
> of the object representation of the active member also become unspecified
> values in C, erasing any trap representations (because unspecified values
> in C are not trap representations).
>
> It appears C++ is less aggressive with respect to initialized padding
> bytes in structure and union types.
>
>
>>
>> > It is further an issue that the definition of trap representation does
>> > not admit trap representations for the exact-width integer types (int16_t,
>> > etc.) because the lack of padding bits, combined with two's complement
>> > representation, means that every possible object representation represents
>> > a value of the type.
>>
>> So, C says that uninitialized automatic variables have indeterminate
>> value.
>> That's fine for some types, because the implementation can choose to use
>> a trap representation as the particular value used for that case,
>> which makes any access undefined behavior.
>> However, this avenue can't be taken for e.g. int16_t, because there are
>> no bits left for a trap representation to be a possibility.
>>
> Yes, for the C++-compatible range of int16_t, there are no bits left for a
> trap representation to be a possibility.
>
>
>> But it's really helpful for optimizers to assume that uninitialized
>> automatic variables can't be read.
>>
>> > I think it would really help if the C committee could record a decision
>> > to move towards harmonizing with C++ on this.
>>
>> Yes. In particular since optimizers are likely to be the same for C and
>> C++, anyway.
>>
> Yes, and we're at risk of getting a hodgepodge of the C and C++ rules
> in implementations.
>
>
>>
>> Jens
>>
> _______________________________________________
> Liaison mailing list
> Liaison_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/liaison
> Link to this post: http://lists.isocpp.org/liaison/2020/10/0204.php
>

Received on 2020-10-10 05:42:14