sg12: Re: [ub] Type punning to avoid copying (was: unions and undefined behavior)

From: Jeffrey Yasskin <jyasskin_at_[hidden]>
Date: Thu, 25 Jul 2013 15:24:33 -0700

On Thu, Jul 25, 2013 at 7:53 AM, Howard Hinnant
<howard.hinnant_at_[hidden]> wrote:
> It was this SO question that started this thread:
>
> http://stackoverflow.com/q/17789928/576911
>
> I'm curious: The accepted answer uses memcpy, and the claim is that this is a correct answer to the question. That is, it does not exhibit undefined behavior. My current understanding is that this answer is correct, but I wanted to check here. Do people here agree that:
>
> http://stackoverflow.com/a/17790026/576911
>
> does not break the aliasing rules, or otherwise invoke undefined behavior?

I believe the memcpy answer does not break the aliasing rules, and
that it has implementation-defined behavior, but I can't quite prove
it.
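
For concreteness, the memcpy approach under discussion looks roughly
like the sketch below. This is not the code from the linked answer:
the helper names float_bits and bits_to_float are made up for
illustration, and the sketch assumes float and std::uint32_t have the
same size (the static_assert checks this).

```cpp
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float occupies exactly as many bytes as uint32_t.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: copy the bytes of a float into a uint32_t.
// No float object is ever accessed through a uint32_t lvalue.
std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Hypothetical helper: the reverse copy. Which float value results
// depends on the implementation's floating-point representation.
float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

The only accesses are through memcpy, so the aliasing question reduces
to what memcpy is allowed to do with object representations.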

[basic.types]p4 "The object representation of an object of type T is
the sequence of N unsigned char objects taken up by the object of type
T, where N equals sizeof(T). The value representation of an object is
the set of bits that hold the value of type T. For trivially copyable
types, the value representation is a set of bits in the object
representation that determines a value, which is one discrete element
of an implementation-defined set of values."
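
As a small illustration of "object representation", the sketch below
copies the sizeof(T) bytes of an object into an array of unsigned
char. The name object_representation is hypothetical, and the
restriction to trivially copyable types is a choice made for this
sketch.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: return a copy of the N = sizeof(T) bytes that make up
// the object representation of t.
template <class T>
std::array<unsigned char, sizeof(T)> object_representation(const T& t) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch only handles trivially copyable types");
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &t, sizeof(T));
    return bytes;
}
```

For example, object_representation(std::uint32_t(0)) yields
sizeof(std::uint32_t) bytes, all zero.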

[basic.fundamental]p7 "... The representations of integral types shall
define values by use of a pure binary numeration system."

[basic.fundamental]p8 "The value representation of floating-point
types is implementation-defined."

[cstdint.syn] refers to C99 7.18 to define int32_t and uint32_t:
"7.18.1.1 Exact-width integer types
1 The typedef name intN_t designates a signed integer type with width
N, no padding bits, and a two’s complement representation. Thus,
int8_t denotes a signed integer type with a width of exactly 8 bits.
2 The typedef name uintN_t designates an unsigned integer type with
width N. Thus, uint24_t denotes an unsigned integer type with a width
of exactly 24 bits."
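
Some of these guarantees can be restated as compile-time checks. The
sketch below is only my reading of the quoted text, and the helper
int32_from_bits is a made-up name. It also assumes that int32_t and
uint32_t have the same size, which the quoted wording for uintN_t does
not spell out.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>
#include <limits>

// Width 32 and no padding bits: the object is exactly 32 bits.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32,
              "int32_t: 32 bits, no padding");
// 31 value bits plus a sign bit.
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "int32_t: 31 value bits");
static_assert(std::numeric_limits<std::uint32_t>::digits == 32,
              "uint32_t: 32 value bits");
// Assumption of this sketch, not stated in the quoted text.
static_assert(sizeof(std::int32_t) == sizeof(std::uint32_t),
              "sketch assumes equal sizes");

// Hypothetical helper: reinterpret 32 bits as a two's complement value.
std::int32_t int32_from_bits(std::uint32_t u) {
    std::int32_t i;
    std::memcpy(&i, &u, sizeof i);
    return i;
}
```

On an implementation where the assertions hold and uint32_t has no
padding bits either, int32_from_bits(0xFFFFFFFFu) is -1.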

As Gaby mentioned, it's not clear that this memcpy works to initialize
the int32_t (maybe [basic.life]p1?), or even that a memcpy from a
non-int32_t is allowed to write to the object representation of an
int32_t. ([basic.types]p2 implies otherwise?) We should fix that to
say "yes" for any trivially-copyable type. The resulting object
representation may still be a trap representation. (Not, I think, for
int32_t, but for general types.)

I don't quite agree that "One also needs something for the memcpy()
back into x, saying that does not yield a trap representation". At
least, it doesn't need to be a rule in the language for all
implementations. If the value representation of a type is
implementation-defined, then the programmer can cooperate with her
implementation to ensure she doesn't create a trap representation,
even if another implementation running the same code would trap.

Jeffrey

Received on 2013-07-26 00:24:55