Date: Fri, 19 Feb 2021 22:31:15 +0100
On 17/02/2021 20.51, Uecker, Martin wrote:
> On Wednesday, 17.02.2021, 19:13 +0100, Jens Maurer wrote:
>> However, this seemed to pessimize optimizations:
>>
>> unsigned char c; // not initialized
>> (void) &c; // make sure it's not in a register
>> unsigned char d = c; // fixate the value of c
>> unsigned char e = d;
>> assert(d == e);
>>
>> While we don't know the value of d, the approach that only
>> objects have indeterminate values, not expressions, made us
>> guarantee d == e for this case (the value of c was fixated
>> into d when read).
>
> Yes, this is what C has now, although we would say the value
> is unspecified.
>> So, we had C++ core issue 1787
>> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1787
>> which introduced the current, more elaborate scheme:
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3914.html
>>
>> (The text was meanwhile reshuffled editorially.)
>
> Ok, this is the version I have a question about.
>
> The defect (?) is that "it would be more helpful for
> optimizers". But it breaks even minimum guarantees needed
> for basic reasoning about such values, i.e. that if
> you copy it you get the same (what is encoded in the
> 'assert' above). Wouldn't it make sense to make
> it completely undefined? In other words, how are
> these residual rules still useful?
So, for anything but "unsigned char", even looking at
an uninitialized value is undefined behavior in C++
(in C parlance, there might be a trap representation).
I think the argument for the current rules in C++ is
that you should be able to implement memcpy on structs
with padding bytes by looping over "unsigned char".
It doesn't matter whether the padding bytes in the
original and the copy have the same value, because
memcmp is deemed unreliable on data with padding
bytes anyway.
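
To illustrate the memcpy argument, here is a minimal sketch, not a
definitive implementation. The names Padded, copy_bytes and duplicate
are hypothetical and chosen only for this example; whether Padded
actually contains padding bytes depends on the platform.

```cpp
#include <cstddef>

// Hypothetical struct; on typical ABIs there are padding bytes
// between 'tag' and 'value', and those bytes are never initialized.
struct Padded {
    char tag;
    int value;
};

// Hypothetical memcpy-like helper: copies n bytes one at a time
// through unsigned char. Copying an indeterminate unsigned char
// into another unsigned char is the case C++ permits; the
// destination byte then simply becomes indeterminate as well.
void copy_bytes(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i)
        d[i] = s[i];
}

// Copies the whole object representation, padding included.
Padded duplicate(const Padded& in) {
    Padded out;
    copy_bytes(&out, &in, sizeof in);
    return out;
}
```

After duplicate(), the members tag and value of the copy compare equal
to the original. Nothing is promised about the padding bytes, which is
why a memcmp over the two objects is not a reliable equality test.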
>>> If the byte is uninitialized the value is unspecified
>>> but consistent.
>>> In C++ this does not seem to work as
>>> indeterminate values read then cause UB later.
>>
>> In C++, you are allowed to copy indeterminate unsigned char values.
>> Any other dealing with indeterminate values is undefined behavior.
>
> "copy" in the sense of using it to make some other
> variable also indeterminate.
Yes. The "indeterminate" state is something beyond the
bit-pattern representation of the value that happens to
be at the memory location.
Jens
Received on 2021-02-19 15:31:20