C++ Logo


Advanced search

Re: [ub] [c++std-core-23844] Re: unions and undefined behavior

From: Jeffrey Yasskin <jyasskin_at_[hidden]>
Date: Wed, 24 Jul 2013 10:27:24 -0700
On Wed, Jul 24, 2013 at 8:43 AM, Howard Hinnant
<howard.hinnant_at_[hidden]> wrote:
> On Jul 23, 2013, at 10:36 PM, Lawrence Crowl <Lawrence_at_[hidden]> wrote:
>>> This isn't an academic argument. If we can keep the compiler from
>>> optimizing away the following example code (which has UB from what I've been
>>> reading here), we make some parts of the std::library components 2 to 3
>>> times faster. That's a much better optimization than what the compilers are
>>> getting from actively exploiting UB.
>>> #include <utility>
>>> #include <string>
>>> template <class T, class U>
>>> struct pair
>>> {
>>> T first;
>>> U second;
>>> };
>>> template <class T, class U>
>>> union V
>>> {
>>> typedef pair< T, U> NPair;
>>> typedef pair<const T, U> CPair;
>>> NPair n;
>>> CPair c;
>>> V() : c() {}
>>> V(const T& t) : c{t} {}
>>> V(const T& t, const U& u) : c{t, u} {}
>>> V(const V& v) : c(v.c) {}
>>> V& operator=(const V& v) {n = v.n; return *this;}
>>> V(V&& v) : n(std::move(v.n)) {}
>>> V& operator=(V&& v) {n = std::move(v.n); return *this;}
>>> ~V() {c.~CPair();}
>>> };
>>> struct X
>>> {
>>> std::string s;
>>> };
>>> struct Y
>>> {
>>> std::string s;
>>> };
>>> int
>>> main()
>>> {
>>> typedef V<X, Y> Z;
>>> Z z1(X{"some "}, Y{"text"});
>>> Z z2 = std::move(z1);
>>> }
>> Can you explain your reasoning here?
>> BTW, I think there is potential optimization in knowing that a value
>> will not change, and so writing into a const variable strikes me as
>> being locally optimistic and globally pessimistic.
> This is a crude model of std::map<T, U>::value_type that would allow map's copy assignment operator to recycle nodes, instead of deleting all nodes in the lhs prior to inserting new ones. Here is a an example chart of the performance of the copy assignment of a map<int, int> when the rhs has 1000 values, the lhs has between 0 and 2000 values (prior to the assignment), and the implementation is varied between a simple clear()/insert(), a structural copy (avoids rebalancing the lhs at each insert), and recycling nodes using something very similar to what is sketched above with the union V.

Thanks for the concrete use case.

There's always the hack of destroying and re-constructing a node when
you want to re-use its memory without going through the global
allocator. Then you have to figure out what you need to do with
pointers to keep them valid. e.g. is "p->~T(); new(p) T(...); use(p);"
valid, or must one write "p->~T(); p = new(p) T(...); use(p);"? (note
the extra "p=") [basic.life]p7 addresses this ... and says that for
class types with a const-qualified data member, you do need to
re-assign the pointer. (C++03 through C++14)

Can you try that approach and see if the no-op assignments cost
measurable performance? If they do, probably a clang or llvm bug would
be in order.

However, I agree that this is more subtle than I'm happy with. We
already have the notion of layout-compatible types, but T and const T
aren't layout-compatible (C++14[basic.types]p11). What would go wrong
if we said that cv-qualified versions of T were layout-compatible with
T? Then C++14[class.mem]p18 would allow your union code to work, since
the common initial sequence of your two pair<> structs would cover the
whole objects. (well, depending some on the meaning of "inspect"; it
would be nice to avoid such unclear terms)


Received on 2013-07-24 19:27:45