SG12

Subject: Re: [ub] type punning through congruent base class?
From: Herb Sutter (hsutter_at_[hidden])
Date: 2014-01-17 19:12:04


For the first example: I’d be inclined to view it as type-unsafe because of the cast from void* to B*, not because of the malloc.

For the rest: At quick glance, the my_malloc/my_free internals look type safe to me, so the issue isn’t wrapping malloc/free per se. Rather, it’s the calling code that’s unsafe for casting from void* to B*, which is unchanged in the optimized version of the code.

So IMO the three examples are all the same because they all contain the unsafe cast from void* to B*, just moving it around a little.


Moving to a question you didn’t ask: What if my_malloc/my_free returned/took not void*, but my_class* (so a class-specific allocator, implemented using malloc/free)? That is:

std::map<size_t, std::stack<my_class*>> size_classes = {{16, {}}, {32, {}}, ...};

my_class* my_malloc(size_t size) {
  auto size_class = size_classes.lower_bound(size);
  assert(size_class != size_classes.end());
  if (size_class->second.empty())
    return (my_class*)malloc(size_class->first);
  my_class* result = size_class->second.top();
  size_class->second.pop();
  return result;
}

void my_free(size_t size, my_class* block) {
  size_classes.lower_bound(size)->second.push(block);
}

Then this would require some decoration around the (single) cast to my_class*, perhaps:

  ...
  if (size_class->second.empty()) {
    my_class* ret = nullptr;
    extern "C-style" {
      ret = (my_class*)malloc(size_class->first);
    }
    return ret;
  }
  ...

And this doesn’t surprise me, because I would expect the internals of an allocator to be a classic example of code that resorts to the explicit type-unsafe escape hatch (an extern "C-style" { } block around the cast, or whatever the syntax ends up being).
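
Since extern "C-style" is not existing syntax, here is a rough approximation that compiles today: the single cast is confined to one helper function, which is the part that would sit inside the escape hatch. The definition of my_class and the helper name unsafe_alloc_as_my_class are assumptions made up for illustration, and the elided size classes are cut down to the two that were spelled out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <stack>

// Hypothetical trivially constructible type standing in for my_class.
struct my_class {
  int i;
};

// The only place that turns untyped storage into a my_class*.
// In a type-safe mode, this body is what would need the escape hatch.
static my_class* unsafe_alloc_as_my_class(std::size_t bytes) {
  return static_cast<my_class*>(std::malloc(bytes));
}

// Free lists keyed by block size; only two size classes for brevity.
static std::map<std::size_t, std::stack<my_class*>> size_classes = {{16, {}},
                                                                    {32, {}}};

my_class* my_malloc(std::size_t size) {
  auto size_class = size_classes.lower_bound(size);
  assert(size_class != size_classes.end());
  if (size_class->second.empty())
    return unsafe_alloc_as_my_class(size_class->first);
  my_class* result = size_class->second.top();  // already typed, no cast
  size_class->second.pop();
  return result;
}

void my_free(std::size_t size, my_class* block) {
  size_classes.lower_bound(size)->second.push(block);
}
```

Everything outside the helper deals only in my_class*, so the calling code never needs a cast of its own.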

Herb


From: ub-bounces_at_[hidden] [mailto:ub-bounces_at_[hidden]] On Behalf Of Jeffrey Yasskin
Sent: Friday, January 17, 2014 4:21 PM
To: WG21 UB study group
Subject: Re: [ub] type punning through congruent base class?

I'd like to complicate things further, in response to the idea that the snippet involving 'B' and 'short' is type-punning and that we should consider a type-safe mode. :)

First, consider whether the following code, intended to be completely normal C code:

  B* pb = (B*)malloc(sizeof(B));
  pb->i = 0;
  free(pb);
  short* ps = (short*)malloc(sizeof(short));
  *ps = 0;
  free(ps);

looks like code that should be valid in this type-safe mode. If you'd like to ban it, the rest of this post won't have anything interesting for you.

Now imagine I write a library (please forgive compile and logic errors, and typos):

std::map<size_t, std::stack<void*>> size_classes = {{16, {}}, {32, {}}, ...};

void* my_malloc(size_t size) {
  auto size_class = size_classes.lower_bound(size);
  assert(size_class != size_classes.end());
  if (size_class->second.empty())
    return malloc(size_class->first);
  void* result = size_class->second.top();
  size_class->second.pop();
  return result;
}

void my_free(size_t size, void* block) {
  size_classes.lower_bound(size)->second.push(block);
}

Then I use it like:

  B* pb = (B*)my_malloc(sizeof(B));
  pb->i = 0;
  my_free(sizeof(B), pb);
  short* ps = (short*)my_malloc(sizeof(short));
  *ps = 0;
  my_free(sizeof(short), ps);

Is this worse than the above malloc/free-based code? That is, can users write wrappers around malloc and free?

But the compiler can inline the my_malloc use down to:

  void *p = malloc(16); // Probably 16.
  B* pb = (B*)p;
  pb->i = 0;
  short* ps = (short*)pb;
  *ps = 0;

using knowledge of the behavior of std::map and std::stack, which is nearly identical to the "type-punning" code. So how do we make the type-punning invalid without breaking standard malloc-based code or user-written libraries?
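
For anyone who wants to try this, here is a compilable sketch of the wrapper, with the elided size classes cut down to two. The definition of B and the function same_block_is_recycled are assumptions added for illustration. Unlike the snippets above, it uses placement new so that the lifetime changes are written out explicitly rather than implied; that is one possible answer to the question, not a claim about what the current wording requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>
#include <stack>

// Assumed definition; the thread only relies on B having an int member i.
struct B {
  int i;
};

// Free lists keyed by block size; only two size classes for brevity.
static std::map<std::size_t, std::stack<void*>> size_classes = {{16, {}},
                                                                {32, {}}};

void* my_malloc(std::size_t size) {
  auto size_class = size_classes.lower_bound(size);
  assert(size_class != size_classes.end());
  if (size_class->second.empty())
    return std::malloc(size_class->first);
  void* result = size_class->second.top();
  size_class->second.pop();
  return result;
}

void my_free(std::size_t size, void* block) {
  size_classes.lower_bound(size)->second.push(block);
}

// Returns true if the block released by the B is handed back for the short.
bool same_block_is_recycled() {
  void* p1 = my_malloc(sizeof(B));
  if (p1 == nullptr) return false;
  B* pb = new (p1) B;  // explicitly starts the lifetime of a B
  pb->i = 0;
  my_free(sizeof(B), pb);  // block goes back on the 16-byte free list

  void* p2 = my_malloc(sizeof(short));  // pops that same block
  short* ps = new (p2) short;  // reuses the storage, ending the B's lifetime
  *ps = 0;
  bool recycled = (p1 == p2);
  my_free(sizeof(short), ps);
  return recycled;
}
```

Both requests round up to the 16-byte size class, so the second my_malloc call returns the block that the first one released, which is the same situation as the inlined code above.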

On Fri, Jan 17, 2014 at 2:56 PM, Herb Sutter <hsutter_at_[hidden]> wrote:
>> Note that this post from Herb arrived after
>> http://www.open-std.org/pipermail/ub/2014-January/000418.html but was sent
>> before, so the thread got a little mixed up.
>
> Yes, I've been trying to reply less on this thread until that sync'ed back
> up. :)
>
> From what I've learned in this thread, the (rough) intended C++ model for
> PODs (assuming memory of the right size/alignment) would seem to be "the
> lifetime of a B starts when you write to the memory as a B, and ends when
> you free the memory or write to the memory as a different type." [Disclaimer:
> I'm not sure if "read from the memory as a B" also starts lifetime.]
>
> I think we can do better, but it seems like that's the (rough) intent of the
> status quo, leaving aside the question of whether the wording actually says
> that.
>
> *If* that is the (rough) intent, then in:
>
>> void *p = malloc(sizeof(B)); // 1
>>
>> B* pb = (B*)p; // 2
>>
>> pb->i = 0; // 3
>>
>> short* ps = (short*)p; // 4
>> *ps = 0; // 5
>>
>> free(p); // 6
>
>
> I assume that the reasoning would be that:
>
> line 3 starts the lifetime of a B (we're writing to the bits of a B member,
> not just any int)
> line 5 ends the lifetime of that B and begins the lifetime of a short
> line 6 ends the lifetime of that short
>
>
> Again ignoring whether this is desirable, is that (roughly) the intent of
> the current wording?
>
>
> If yes, does the wording express it (a) accurately and (b) clearly?
>
>
> Finally, regardless of the above answer, do we want to change anything about
> the legality or semantics of the above type-punning code, such as possibly
> having a "type-safe mode" where such code is somehow not allowed unless in
> an "extern "C-compat"" block or something?
>
>
> Herb
>
>
>
> ________________________________
> From: ub-bounces_at_[hidden] <ub-bounces_at_[hidden]> on behalf of Jeffrey
> Yasskin <jyasskin_at_[hidden]>
> Sent: Friday, January 17, 2014 1:34 PM
>
> To: WG21 UB study group
> Subject: Re: [ub] type punning through congruent base class?
>
> Note that this post from Herb arrived after
> http://www.open-std.org/pipermail/ub/2014-January/000418.html but was sent
> before, so the thread got a little mixed up.
>
> On Thu, Jan 16, 2014 at 11:38 AM, Herb Sutter <hsutter_at_[hidden]> wrote:
>>
>> Richard, it cannot mean that (or if it does, IMO we have an obvious bug)
>> for at least two specific reasons I can think of (below), besides the
>> general reasons that it would not be sensical and would violate type safety.
>
>
> We do have an obvious bug in [basic.life]p1, "The lifetime of an object of
> type T begins when storage with the proper alignment and size for type T is
> obtained", if we interpret "obtained" as "obtained from the memory
> allocator". Even with strict uses of placement-new to change the type of
> memory, placement-new doesn't "obtain" any memory. If we interpret
> "obtained" as just "the programmer intends a region of storage to be
> available for a T", as I think Richard is suggesting, the bug is only that
> we need the wording to be clearer.
>
>> First, objects must have unique addresses. Consider, still assuming B is
>> trivially constructible:
>>
>> void *p = malloc(sizeof(B));
>
>
> The lifetime of a B starts some time between the malloc() call in the above
> line (inclusive) and the access of 'pb->i' two lines down. [basic.life]p5
> ("Before the lifetime of an object has started ... The program has undefined
> behavior if ... the pointer is used to access a non-static data member")
>
> The assignment to 'i' might start the lifetime of an 'int' subobject, but
> that's not enough to make the use of 'pb->i' defined if no 'B's lifetime has
> started.
>
>>
>> B* pb = (B*)p;
>> pb->i = 0;
>
>
> The lifetime of the B *ends* when its storage is re-used for the 'short'
> ([basic.life]p1 "The lifetime of an object of type T ends when ... the
> storage which the object occupies is reused"), as Daveed said. This happens
> some time after the access in the previous line and before the assignment two
> lines down.
>
>>
>> short* ps = (short*)p;
>> *ps = 0;
>>
>> This cannot possibly be construed as starting the lifetime of a B object
>> and a short object, else they would have the same address, which is illegal.
>> Am I missing something?
>
>
> Both a B object and a short object have their lifetimes started in your code
> snippet, but the lifetimes don't overlap.
>
> Confusingly, the start of these lifetimes is *not* called out in any
> particular line of code; it's implied by them. In particular, the casts
> don't have any lifetime effects (contra the straw man at
> http://www.open-std.org/pipermail/ub/2014-January/000406.html). The code
> would be just as defined (or undefined) written as:
>
> void *p = malloc(sizeof(B));
>
> B* pb = (B*)p;
> short* ps = (short*)p;
> pb->i = 0;
>
> *ps = 0;
>
>
> As Matt alluded to in
> http://www.open-std.org/pipermail/ub/2014-January/000456.html, it might be
> possible to say that all lifetime effects are called out in explicit
> expressions without breaking C compatibility, *if* we instead say that
> accessing the members of objects with trivial constructors can be done
> outside of the lifetime of such objects. I have no idea whether that would
> be better or worse than saying that lifetime effects can be implied.
>
>
> Jeffrey
>
>
> _______________________________________________
> ub mailing list
> ub_at_[hidden]
> http://www.open-std.org/mailman/listinfo/ub
>



SG12 list run by herb.sutter at gmail.com