sg12: Re: [ub] Type punning to avoid copying

From: Gabriel Dos Reis <gdr_at_[hidden]>
Date: Fri, 26 Jul 2013 22:17:42 -0500

Jeffrey Yasskin <jyasskin_at_[hidden]> writes:

| On Fri, Jul 26, 2013 at 6:18 PM, Gabriel Dos Reis <gdr_at_[hidden]> wrote:
| > Jeffrey Yasskin <jyasskin_at_[hidden]> writes:
| >
| > [...]
| >
| > | (We do need to have the standard endorse using memcpy to set the
| > | object representation of a trivially-copyable struct, but I don't hear
| > | any disagreement about wanting that. I'll mail Ville about an EWG
| > | issue.)
| >
| > Actually this is on my list of issues I would like to see SG12 discuss,
| > and as our first recommendation to forward to EWG+CWG at the Chicago meeting.
|
| Oh, ok. I'll leave that to you then. :)
|
| > | > People use the union hack because it is easy to program, given that it is
| > | > mainly declarative. Declare a struct that corresponds to the layout of the
| > | > data, cast the buffer pointer to the struct pointer, and it usually just
| > | > works. If it isn't that easy, people will just keep using the union hack.
| > |
| > | That's ... not the union hack. The union hack is the thing that gcc
| > | (http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/Optimize-Options.html#index-fstrict_002daliasing-881)
| > | and probably C99 endorse where you write one field of a union and read
| > | a different field. Simply casting a char[] to a different type is
| > | something else, which I hear more objections to simply allowing.
| >
| > In fact, I would prefer to see standards support -- either in the form of
| > library functions or in the form of some language features or a combination --
| > for object and value representation as arrays of unsigned char, along
| > with your bit_cast. My personal preference would be that over the
| > "union hack".
|
| SGTM, although I think the fact that gcc and C99 endorse the union
| hack should push us to consider it strongly, in addition to memcpy.
| (Your message is slightly ambiguous. Responding to one interpretation,
| I'd think it was a shame if we insisted that people use a function
| that didn't exist in C++14 in order to set the object representation
| in C++17.)

I am not sure I understand, but I would say that it is appropriate for
C++17 to use constructs in C++17 to resolve any existing issue it might
have. I wouldn't call it shameful; but again, I am not sure I
understand the remark.

The language already defines object representation and value
representation and how to access them. It has a relatively well-defined
notion of object lifetime that is the bedrock of its object semantics --
and pretty much everything else.
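
As a rough illustration of what access through the object representation
can look like today, here is a minimal sketch based on memcpy. The names
bit_cast_sketch and object_bytes are hypothetical (they are not standard
functions), and the sketch assumes both types are trivially copyable and
of equal size:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper, loosely modeled on the bit_cast idea mentioned in
// this thread. It copies the object representation of `from` into a
// fresh object of type To instead of aliasing the storage.
template <class To, class From>
To bit_cast_sketch(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    To to;  // assumption: To is default-constructible
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Hypothetical helper: copies the object representation of `obj` into an
// array of unsigned char of exactly sizeof(T) elements.
template <class T>
void object_bytes(const T& obj, unsigned char (&out)[sizeof(T)]) {
    std::memcpy(out, &obj, sizeof(T));
}
```

No pointer of one type is ever used to read an object of another type
here; only bytes are copied, which is why this style sidesteps the
aliasing questions raised by the cast-based approach.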

While the standard uses the term "active member", it does not define it, nor
does it relate it to the fundamental notion of lifetime. The best I can
find is 9.5/1:

  In a union, at most one of the non-static data members can be active
  at any time, that is, the value of at most one of the non-static data
  members can be stored in a union at any time. [...]

and the non-normative note 9.5/4:

  [ Note: In general, one must use explicit destructor calls and
  placement new operators to change the active member of a union. — end
  note ] [ Example: Consider an object u of a union type U having
  non-static data members m of type M and n of type N. If M has a
  non-trivial destructor and N has a non-trivial constructor (for
  instance, if they declare or inherit virtual functions), the active
  member of u can be safely switched from m to n using the destructor
  and placement new operator as follows:
     u.m.~M();
     new (&u.n) N;
  — end example ]
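
For concreteness, the example in that note can be fleshed out into a
complete program fragment along the following lines. This is only a
sketch: the choice of std::string for M, the struct N, and the function
switch_active_member are assumptions made for illustration and do not
come from the standard's text.

```cpp
#include <new>
#include <string>

typedef std::string M;  // M has a non-trivial destructor

struct N {
    int value;
    N() : value(42) {}  // N has a non-trivial constructor
};

union U {
    M m;
    N n;
    U() : m("hello") {}  // m is the active member after construction
    ~U() {}              // the union cannot know which member is active,
                         // so cleanup is left to the user
};

// Switches the active member of a local union from m to n and returns
// the value then read through n.
int switch_active_member() {
    U u;
    u.m.~M();       // end the lifetime of m
    new (&u.n) N;   // begin the lifetime of n
    return u.n.value;  // N is trivially destructible, nothing to clean up
}
```

Note that the switch is expressed entirely in terms of ending one
object's lifetime and starting another's, which is the reading of
"active member" discussed in the next paragraph.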

If the general idea is that "active member" implies construction and
lifetime, then that should be defined, and I think that would mean that
the "union hack" solution would be even less effective.

-- Gaby

Received on 2013-07-27 05:17:59