Date: Wed, 24 Jul 2013 10:27:24 -0700
On Wed, Jul 24, 2013 at 8:43 AM, Howard Hinnant
<howard.hinnant_at_[hidden]> wrote:
> On Jul 23, 2013, at 10:36 PM, Lawrence Crowl <Lawrence_at_[hidden]> wrote:
>
>>> This isn't an academic argument. If we can keep the compiler from
>>> optimizing away the following example code (which has UB from what I've been
>>> reading here), we make some parts of the std::library components 2 to 3
>>> times faster. That's a much better optimization than what the compilers are
>>> getting from actively exploiting UB.
>>>
>>> #include <utility>
>>> #include <string>
>>>
>>> template <class T, class U>
>>> struct pair
>>> {
>>> T first;
>>> U second;
>>> };
>>>
>>> template <class T, class U>
>>> union V
>>> {
>>> typedef pair< T, U> NPair;
>>> typedef pair<const T, U> CPair;
>>> NPair n;
>>> CPair c;
>>>
>>> V() : c() {}
>>> V(const T& t) : c{t} {}
>>> V(const T& t, const U& u) : c{t, u} {}
>>> V(const V& v) : c(v.c) {}
>>> V& operator=(const V& v) {n = v.n; return *this;}
>>> V(V&& v) : n(std::move(v.n)) {}
>>> V& operator=(V&& v) {n = std::move(v.n); return *this;}
>>> ~V() {c.~CPair();}
>>> };
>>>
>>> struct X
>>> {
>>> std::string s;
>>> };
>>>
>>> struct Y
>>> {
>>> std::string s;
>>> };
>>>
>>> int
>>> main()
>>> {
>>> typedef V<X, Y> Z;
>>> Z z1(X{"some "}, Y{"text"});
>>> Z z2 = std::move(z1);
>>> }
>>
>> Can you explain your reasoning here?
>>
>> BTW, I think there is potential optimization in knowing that a value
>> will not change, and so writing into a const variable strikes me as
>> being locally optimistic and globally pessimistic.
>
> This is a crude model of std::map<T, U>::value_type that would allow map's copy assignment operator to recycle nodes, instead of deleting all nodes in the lhs prior to inserting new ones. Here is a an example chart of the performance of the copy assignment of a map<int, int> when the rhs has 1000 values, the lhs has between 0 and 2000 values (prior to the assignment), and the implementation is varied between a simple clear()/insert(), a structural copy (avoids rebalancing the lhs at each insert), and recycling nodes using something very similar to what is sketched above with the union V.
>
Thanks for the concrete use case.
There's always the hack of destroying and re-constructing a node when
you want to re-use its memory without going through the global
allocator. Then you have to figure out what you need to do with
pointers to keep them valid. e.g. is "p->~T(); new(p) T(...); use(p);"
valid, or must one write "p->~T(); p = new(p) T(...); use(p);"? (note
the extra "p=") [basic.life]p7 addresses this ... and says that for
class types with a const-qualified data member, you do need to
re-assign the pointer. (C++03 through C++14)
Can you try that approach and see if the no-op assignments cost
measurable performance? If they do, probably a clang or llvm bug would
be in order.
However, I agree that this is more subtle than I'm happy with. We
already have the notion of layout-compatible types, but T and const T
aren't layout-compatible (C++14[basic.types]p11). What would go wrong
if we said that cv-qualified versions of T were layout-compatible with
T? Then C++14[class.mem]p18 would allow your union code to work, since
the common initial sequence of your two pair<> structs would cover the
whole objects. (well, depending some on the meaning of "inspect"; it
would be nice to avoid such unclear terms)
Jeffrey
<howard.hinnant_at_[hidden]> wrote:
> On Jul 23, 2013, at 10:36 PM, Lawrence Crowl <Lawrence_at_[hidden]> wrote:
>
>>> This isn't an academic argument. If we can keep the compiler from
>>> optimizing away the following example code (which has UB from what I've been
>>> reading here), we make some parts of the std::library components 2 to 3
>>> times faster. That's a much better optimization than what the compilers are
>>> getting from actively exploiting UB.
>>>
>>> #include <utility>
>>> #include <string>
>>>
>>> template <class T, class U>
>>> struct pair
>>> {
>>> T first;
>>> U second;
>>> };
>>>
>>> template <class T, class U>
>>> union V
>>> {
>>> typedef pair< T, U> NPair;
>>> typedef pair<const T, U> CPair;
>>> NPair n;
>>> CPair c;
>>>
>>> V() : c() {}
>>> V(const T& t) : c{t} {}
>>> V(const T& t, const U& u) : c{t, u} {}
>>> V(const V& v) : c(v.c) {}
>>> V& operator=(const V& v) {n = v.n; return *this;}
>>> V(V&& v) : n(std::move(v.n)) {}
>>> V& operator=(V&& v) {n = std::move(v.n); return *this;}
>>> ~V() {c.~CPair();}
>>> };
>>>
>>> struct X
>>> {
>>> std::string s;
>>> };
>>>
>>> struct Y
>>> {
>>> std::string s;
>>> };
>>>
>>> int
>>> main()
>>> {
>>> typedef V<X, Y> Z;
>>> Z z1(X{"some "}, Y{"text"});
>>> Z z2 = std::move(z1);
>>> }
>>
>> Can you explain your reasoning here?
>>
>> BTW, I think there is potential optimization in knowing that a value
>> will not change, and so writing into a const variable strikes me as
>> being locally optimistic and globally pessimistic.
>
> This is a crude model of std::map<T, U>::value_type that would allow map's copy assignment operator to recycle nodes, instead of deleting all nodes in the lhs prior to inserting new ones. Here is a an example chart of the performance of the copy assignment of a map<int, int> when the rhs has 1000 values, the lhs has between 0 and 2000 values (prior to the assignment), and the implementation is varied between a simple clear()/insert(), a structural copy (avoids rebalancing the lhs at each insert), and recycling nodes using something very similar to what is sketched above with the union V.
>
Thanks for the concrete use case.
There's always the hack of destroying and re-constructing a node when
you want to re-use its memory without going through the global
allocator. Then you have to figure out what you need to do with
pointers to keep them valid. e.g. is "p->~T(); new(p) T(...); use(p);"
valid, or must one write "p->~T(); p = new(p) T(...); use(p);"? (note
the extra "p=") [basic.life]p7 addresses this ... and says that for
class types with a const-qualified data member, you do need to
re-assign the pointer. (C++03 through C++14)
Can you try that approach and see if the no-op assignments cost
measurable performance? If they do, probably a clang or llvm bug would
be in order.
However, I agree that this is more subtle than I'm happy with. We
already have the notion of layout-compatible types, but T and const T
aren't layout-compatible (C++14[basic.types]p11). What would go wrong
if we said that cv-qualified versions of T were layout-compatible with
T? Then C++14[class.mem]p18 would allow your union code to work, since
the common initial sequence of your two pair<> structs would cover the
whole objects. (well, depending some on the meaning of "inspect"; it
would be nice to avoid such unclear terms)
Jeffrey
Received on 2013-07-24 19:27:45