sg12: Re: [ub] [c++std-core-23844] Re: unions and undefined behavior

From: Richard Smith <richardsmith_at_[hidden]>
Date: Wed, 24 Jul 2013 11:46:56 -0700

On Wed, Jul 24, 2013 at 10:27 AM, Jeffrey Yasskin <jyasskin_at_[hidden]>wrote:

> On Wed, Jul 24, 2013 at 8:43 AM, Howard Hinnant
> <howard.hinnant_at_[hidden]> wrote:
> > On Jul 23, 2013, at 10:36 PM, Lawrence Crowl <Lawrence_at_[hidden]> wrote:
> >
> >>> This isn't an academic argument. If we can keep the compiler from
> >>> optimizing away the following example code (which has UB from what
> I've been
> >>> reading here), we make some parts of the std::library components 2 to 3
> >>> times faster. That's a much better optimization than what the
> compilers are
> >>> getting from actively exploiting UB.
> >>>
> >>> #include <utility>
> >>> #include <string>
> >>>
> >>> template <class T, class U>
> >>> struct pair
> >>> {
> >>> T first;
> >>> U second;
> >>> };
> >>>
> >>> template <class T, class U>
> >>> union V
> >>> {
> >>> typedef pair< T, U> NPair;
> >>> typedef pair<const T, U> CPair;
> >>> NPair n;
> >>> CPair c;
> >>>
> >>> V() : c() {}
> >>> V(const T& t) : c{t} {}
> >>> V(const T& t, const U& u) : c{t, u} {}
> >>> V(const V& v) : c(v.c) {}
> >>> V& operator=(const V& v) {n = v.n; return *this;}
> >>> V(V&& v) : n(std::move(v.n)) {}
> >>> V& operator=(V&& v) {n = std::move(v.n); return *this;}
> >>> ~V() {c.~CPair();}
> >>> };
> >>>
> >>> struct X
> >>> {
> >>> std::string s;
> >>> };
> >>>
> >>> struct Y
> >>> {
> >>> std::string s;
> >>> };
> >>>
> >>> int
> >>> main()
> >>> {
> >>> typedef V<X, Y> Z;
> >>> Z z1(X{"some "}, Y{"text"});
> >>> Z z2 = std::move(z1);
> >>> }
> >>
> >> Can you explain your reasoning here?
> >>
> >> BTW, I think there is potential optimization in knowing that a value
> >> will not change, and so writing into a const variable strikes me as
> >> being locally optimistic and globally pessimistic.
> >
> > This is a crude model of std::map<T, U>::value_type that would allow
> map's copy assignment operator to recycle nodes, instead of deleting all
> nodes in the lhs prior to inserting new ones. Here is a an example chart
> of the performance of the copy assignment of a map<int, int> when the rhs
> has 1000 values, the lhs has between 0 and 2000 values (prior to the
> assignment), and the implementation is varied between a simple
> clear()/insert(), a structural copy (avoids rebalancing the lhs at each
> insert), and recycling nodes using something very similar to what is
> sketched above with the union V.
> >
>
> Thanks for the concrete use case.
>
> There's always the hack of destroying and re-constructing a node when
> you want to re-use its memory without going through the global
> allocator. Then you have to figure out what you need to do with
> pointers to keep them valid. e.g. is "p->~T(); new(p) T(...); use(p);"
> valid, or must one write "p->~T(); p = new(p) T(...); use(p);"? (note
> the extra "p=") [basic.life]p7 addresses this ... and says that for
> class types with a const-qualified data member, you do need to
> re-assign the pointer. (C++03 through C++14)
>

Right, under the current standard, that is the way to recycle the memory
and avoid undefined behavior.

Can you try that approach and see if the no-op assignments cost
> measurable performance? If they do, probably a clang or llvm bug would
> be in order.
>
> However, I agree that this is more subtle than I'm happy with. We
> already have the notion of layout-compatible types, but T and const T
> aren't layout-compatible (C++14[basic.types]p11). What would go wrong
> if we said that cv-qualified versions of T were layout-compatible with
> T?

It's important to understand the goal of [basic.life]p7 here. It's easiest
to think about this rule from an implementer's perspective.

>From a programmer's perspective, this rule says that if you have a pointer
or reference to an object, and you destroy the object and create a new one
in the same place, that old pointer or reference can only be used to access
the new object if the old and new types match and don't contain any const
subobjects or reference subobjects.

>From an implementer's perspective, this rule says that you can fold
together a common subexpression that loads from the same position in the
same object, if you're loading a const data member, a reference member, or
a vptr. (The vptr case in particular is important for performance, since it
allows a lot more reasoning about the dynamic type of an object, and thus
more inlining of virtual function calls.)

If we permitted const and non-const types to be layout-compatible, we'd
break [basic.life]p7's guarantees and undermine its intent, and an
implementation would no longer be able to perform common subexpression
elimination on loads of the same const data member of the same object.

Also, the common initial sequence rules only permit "inspect[ing] the
common initial part". So either the active member is pair<T,U> and users of
the map can't modify the .second of values through a pair<const T,U>, or
the active member is pair<const T,U> and Howard's recycling trick does not
work because .first is const. We would require additional changes to the
common initial sequence rules to allow this; changing "layout-compatible"
is insufficient.

Received on 2013-07-24 20:46:58