sg12: Re: [ub] type punning through congruent base class?

From: Gabriel Dos Reis <gdr_at_[hidden]>
Date: Thu, 16 Jan 2014 20:58:31 -0800

Kazutoshi Satoda <k_satoda_at_[hidden]> writes:

| On 2014/01/17 5:53 +0900, Gabriel Dos Reis wrote:
| > | struct B { int x };
| > | struct B* p = (B*) malloc(sizeof(B));
| > | p->x = 17;
| > |
| > | This (modulo the cast) has been how C has handled dynamic allocation
| > | of structs approximately forever. There are no constructors in C,
| > | structs don't get initialized, their fields just get assigned to.
| >
| > Could you walk us through the C standards and explain what you
| > believe this program fragment is supposed to do and what actually is
| > at the address pointed by p?
| > I think that will be a very illuminating exercise -- and would
| > possibly help clear some confusions.
|
| Let me try the exercise. I'm referring WG14 N1570.
| http://www.open-std.org/jtc1/sc22/wg14/www/standards.html
|
|
| p points an object (a region of data storage, 3.15) which is allocated
| by malloc(). The object has allocated storage duration (6.2.4) and its
| lifetime starts at the allocation in malloc(), and ends at deallocation
| in free() (7.22.3).
|
| "p->x" yields an lvalue of type int which designates an object at
| address pointed by ((char*)p + offsetof(B, x)). There is no rule about
| the effective type of the object pointed by p to evaluate the "->"
| expression. (6.5.2.3)
|
| "p->x = 17" makes the effective type of the object between &p->x and
| (&p->x + 1) becomes int (6.5 p6), and stores value 17 (a value
| representation of int which represents 17) into that object (6.5.16).

Agreed. In particular, at this point, we still do not have an object of
type B -- from C's perspective.

| (Note)
| In C, there is no such a thing like "an object of type int" unlike in
| C++ where an object has a type and the type is determined when the
| object is created. In C, "object" is a merely a region of data storage,
| and may be labeled by an effective type which is determined by ongoing
| or previous access to the object at a time.

Yes. This goes back to the comment I made:

# Even if we take the C model, it is not clear what is there. We can
# infer that after the statement 'p->x = 17;' the storage at address
# &p-x' has effective type 'int', but that does not say much about the
# rest. I -suspect- C is OK with that because fundamentally, its object
# model is structural.

| If "struct B x = *p" follows the above code, the access (lvalue
| conversion, 6.3.2.1 p2) on *p, which is allowed by the aliasing rule
| (6.5 p7, bullet 5 "an aggregate or union ..."), changes the effective
| type of the object at p.

Indeed. In particular, the mere write to p->x does not imply that there
is an object of effective type B at p. Another important difference
between C and C++ is that the dynamic type (the closest approximation of
'effective type') of an object does not change during its lifetime.

Going back to constructor, what C++ has done is to automatically
"bless" or perform the conceptial equivalent of access that stamps the
effective type of the object. That also highlights why C++ is carefully
in what can be done between the timeframe where storage is obtained and
initialization is complete (e.g. the effective type/dynamic type is
stamped.)

Now, in most code fragments that have been shown so far (except of
course your last addition) there has been no attempt to access the
storage at p as B, yet it has been claimed that an object of type B existed
there. I don't believe that is correct.

Bonus points: after

struct B x = *p;

what is the effective type of the object at p and &p->x? B and int or B
and B? Can we even give a meaning to the question?

-- Gaby

Received on 2014-01-17 05:58:35