sg12: Re: [ub] [c++std-core-23844] Re: unions and undefined behavior

From: Howard Hinnant <howard.hinnant_at_[hidden]>
Date: Wed, 24 Jul 2013 13:51:24 -0400

On Jul 24, 2013, at 1:27 PM, Jeffrey Yasskin <jyasskin_at_[hidden]> wrote:

> On Wed, Jul 24, 2013 at 8:43 AM, Howard Hinnant
> <howard.hinnant_at_[hidden]> wrote:
>> On Jul 23, 2013, at 10:36 PM, Lawrence Crowl <Lawrence_at_[hidden]> wrote:
>>
>>>> This isn't an academic argument. If we can keep the compiler from
>>>> optimizing away the following example code (which has UB from what I've been
>>>> reading here), we make some parts of the std::library components 2 to 3
>>>> times faster. That's a much better optimization than what the compilers are
>>>> getting from actively exploiting UB.
>>>>
>>>> #include <utility>
>>>> #include <string>
>>>>
>>>> template <class T, class U>
>>>> struct pair
>>>> {
>>>> T first;
>>>> U second;
>>>> };
>>>>
>>>> template <class T, class U>
>>>> union V
>>>> {
>>>> typedef pair< T, U> NPair;
>>>> typedef pair<const T, U> CPair;
>>>> NPair n;
>>>> CPair c;
>>>>
>>>> V() : c() {}
>>>> V(const T& t) : c{t} {}
>>>> V(const T& t, const U& u) : c{t, u} {}
>>>> V(const V& v) : c(v.c) {}
>>>> V& operator=(const V& v) {n = v.n; return *this;}
>>>> V(V&& v) : n(std::move(v.n)) {}
>>>> V& operator=(V&& v) {n = std::move(v.n); return *this;}
>>>> ~V() {c.~CPair();}
>>>> };
>>>>
>>>> struct X
>>>> {
>>>> std::string s;
>>>> };
>>>>
>>>> struct Y
>>>> {
>>>> std::string s;
>>>> };
>>>>
>>>> int
>>>> main()
>>>> {
>>>> typedef V<X, Y> Z;
>>>> Z z1(X{"some "}, Y{"text"});
>>>> Z z2 = std::move(z1);
>>>> }
>>>
>>> Can you explain your reasoning here?
>>>
>>> BTW, I think there is potential optimization in knowing that a value
>>> will not change, and so writing into a const variable strikes me as
>>> being locally optimistic and globally pessimistic.
>>
>> This is a crude model of std::map<T, U>::value_type that would allow map's copy assignment operator to recycle nodes, instead of deleting all nodes in the lhs prior to inserting new ones. Here is a an example chart of the performance of the copy assignment of a map<int, int> when the rhs has 1000 values, the lhs has between 0 and 2000 values (prior to the assignment), and the implementation is varied between a simple clear()/insert(), a structural copy (avoids rebalancing the lhs at each insert), and recycling nodes using something very similar to what is sketched above with the union V.
>>
>
> Thanks for the concrete use case.
>
> There's always the hack of destroying and re-constructing a node when
> you want to re-use its memory without going through the global
> allocator. Then you have to figure out what you need to do with
> pointers to keep them valid. e.g. is "p->~T(); new(p) T(...); use(p);"
> valid, or must one write "p->~T(); p = new(p) T(...); use(p);"? (note
> the extra "p=") [basic.life]p7 addresses this ... and says that for
> class types with a const-qualified data member, you do need to
> re-assign the pointer. (C++03 through C++14)
>
> Can you try that approach and see if the no-op assignments cost
> measurable performance? If they do, probably a clang or llvm bug would
> be in order.

This approach would work fine for map<int, int>. However it is not acceptable for map<string, int> or map<vector, int>. The copy assignment operator for string and vector will recycle (reuse) existing capacity on the lhs, making copy assignment for these types, in general, far faster than a destruction/construction cycle. Said another way, the chart I posted was the most pessimistic example. Though I haven't run the experiment yet (probably should before I say this!), if I were to run the experiment with string (long strings) or vector in the map, I believe the results would be much more dramatic, as you're not only saving a dealloc/alloc of the node, but often of the string or vector's data buffer as well.

> However, I agree that this is more subtle than I'm happy with. We
> already have the notion of layout-compatible types, but T and const T
> aren't layout-compatible (C++14[basic.types]p11). What would go wrong
> if we said that cv-qualified versions of T were layout-compatible with
> T? Then C++14[class.mem]p18 would allow your union code to work, since
> the common initial sequence of your two pair<> structs would cover the
> whole objects. (well, depending some on the meaning of "inspect"; it
> would be nice to avoid such unclear terms)

This sounds like a promising direction, thanks.

One other thing we would need to say (in the library, not the language) is that clients are forbidden from specializing pair in such a way that pair<const T, U> has a different layout than pair<T, U>. But that seems like a very reasonable restriction given the dramatic performance gains such a restriction would unleash.

Howard

Received on 2013-07-24 19:51:22