sg12: Re: [ub] type punning through congruent base class?

From: James Dennett <jdennett_at_[hidden]>
Date: Thu, 16 Jan 2014 23:49:43 -0800

On Thu, Jan 16, 2014 at 11:38 AM, Herb Sutter <hsutter_at_[hidden]> wrote:
> “It’s not a T until the T ctor runs” has to be part of the rules somewhere.

That'd be nice, but it's hard to make it compatible with C, where you
can get a value of type T either by declaring one, or by copying an
existing one (via assignment, or via memcpy/memmove), and maybe in
other ways that I'm not remembering. Certainly C code assumes that
recreating the same bytes is sufficient to be allowed to use them as a
T object.
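
A minimal sketch of that C idiom, compiled as C++. The names Point,
clone_bytes and sum_of_clone are made up for illustration. Whether the
C++ wording formally blesses the final reads is exactly the question in
this thread; the sketch only shows what such code assumes.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

struct Point { int x; int y; };   // hypothetical trivially copyable type

// Recreates the bytes of src in fresh malloc'd storage, C style.
// No constructor of Point is ever invoked on the new storage.
Point* clone_bytes(const Point& src) {
    void* raw = std::malloc(sizeof(Point));
    assert(raw != nullptr);
    std::memcpy(raw, &src, sizeof(Point));
    return static_cast<Point*>(raw);   // C code would now use this as a Point
}

// Reads the copied members back and releases the storage.
int sum_of_clone() {
    Point original{3, 4};
    Point* copy = clone_bytes(original);
    int sum = copy->x + copy->y;
    std::free(copy);
    return sum;
}
```

Code in this style takes it for granted that sum_of_clone() sees the
same member values as the original.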

> It is there for non-trivially-constructible types.
>
> Do we have a defect here that we’re missing that rule for
> trivially-constructible types?

I think not, because (as I see it) such a rule would be a defect in
that it would invalidate much valid C/C++ code.

int *p = (int*)malloc(sizeof(int));
*p = 42;
*p = 0;

After this, there is an int, and p points to it, and at no time was
any constructor (trivial or otherwise) used. C doesn't have
constructors. C has regions of memory and effective types, and the
act of writing to a piece of memory sets its effective type.

We could say that each assignment above constructs an int first, but
then we'd also want to say that it destroys the previous int... except
that there wasn't an int there until the first assignment put one
there. It cannot be a precondition of "*p = 0" that p points at an
int, and it can't be a precondition that it doesn't. Both worked just
fine in C; assignment was memcpy, and gave the same object
representation, and hence the same value. (I seem to recall reading
C's rules and finding that they are also self-contradictory when read
in more depth, but I certainly lack motivation to try to fix that.)
Maybe we could say that "*p = 0" ends the lifetime of the
trivially-destructible object that was previously at p, if there was
one (but not one of type int). Seems fragile to me, but it
approximates what C permits.
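
A sketch of the case that rule would have to cover, with storage that
first holds a float and is then overwritten as an int. The function
name retype_storage is hypothetical. This mirrors what C's effective
type rules allow; whether C++ gives it defined behavior is the open
point, so treat it as an illustration of the idiom and not as a claim
about the standard.

```cpp
#include <cassert>
#include <cstdlib>

// Writes a float into malloc'd storage, then reuses the same storage as an int.
int retype_storage() {
    const std::size_t size =
        sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void* raw = std::malloc(size);   // malloc'd storage is aligned for both types
    assert(raw != nullptr);

    float* f = static_cast<float*>(raw);
    *f = 1.5f;                       // in C terms: effective type is now float

    int* p = static_cast<int*>(raw);
    *p = 0;                          // in C terms: effective type becomes int

    int result = *p;                 // only the int is read; the float is never read again
    std::free(raw);
    return result;
}
```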

I agree that what C++ says is untenable (allocating storage really
can't start the lifetime of objects of types that aren't even
expressed in code at the time the allocation happens), but C didn't
distinguish between overwriting a bunch of bytes and overwriting a
value of a type. C++ approximately mirrors C in saying that "An
object is a region of storage", but we only half mean it. We have two
notions of what objects are -- one mostly inherited from C, and a new,
arguably-cleaner one added by C++. Where the two collide we get
unpleasant quirks, like the issue raised some while ago (maybe by
Gaby) where p->~int() is explicitly a no-op, and doesn't end the
lifetime of the int.
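
For context, that call usually shows up in generic code, spelled
p->~T() with T deduced as int (the grammar wants a type-name after the
~, so the literal spelling needs an alias or a template parameter).
Below is a small sketch; destroy_in_place, Tracker, demo_int and
demo_tracker are hypothetical names. Since the status of the int after
the call is the disputed point, the sketch does not read the old value
and instead re-creates the int with placement new.

```cpp
#include <cassert>
#include <new>

// Hypothetical generic helper: explicit destructor call on *p.
template <typename T>
void destroy_in_place(T* p) {
    p->~T();   // for T = int this is a pseudo-destructor call
}

struct Tracker {               // hypothetical class with an observable destructor
    static int destroyed;
    ~Tracker() { ++destroyed; }
};
int Tracker::destroyed = 0;

int demo_int() {
    alignas(int) unsigned char buf[sizeof(int)];
    int* p = ::new (buf) int(7);
    destroy_in_place(p);       // nothing observable happens for int
    p = ::new (buf) int(5);    // re-create instead of relying on the old int
    return *p;
}

int demo_tracker() {
    alignas(Tracker) unsigned char buf[sizeof(Tracker)];
    Tracker* t = ::new (buf) Tracker;
    destroy_in_place(t);       // runs ~Tracker exactly once
    return Tracker::destroyed;
}
```

The same helper does real work for Tracker and nothing visible for int,
which is where the two notions of "object" rub against each other.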

-- James

Received on 2014-01-17 08:49:45