On Sun, Sep 27, 2020 at 4:16 PM Giuseppe D'Angelo <giuseppe.dangelo@kdab.com> wrote:

On 25/09/20 05:01, Arthur O'Dwyer wrote:
> You propose a new library function to do a thing *less
> efficiently* than the programmer could do it by hand. I think you
> should look for an implementation strategy that lets you produce the
> best possible codegen.
[snip]
> However, I think this is all of questionable relevance, because capacity
> is not a *salient* feature of std::string.

This criticism is more than fair, and I thank you for it. I think the
point(s) raised here boil down to a few key observations:

1) Is the proposed implementation the "best" available for a generic
type T about which we know nothing?

A: I'd say a humble "yes" here.

"Yes" in the sense that, if all you know about T is that it is move-constructible, move-assignable, and default-constructible, then this is the best you can do.

But if you additionally know that T supplies a `.clear()` method, or a `.reset()` method, then you may be able to do better.

And, if you don't even know whether T supports move-assignment, then you may not be able to do as well as you think!

It's all about concepts. :) You have an algorithm you want to write; figure out what primitive operations T must provide in order to implement that algorithm efficiently. The lower-level your algorithm is, the more likely it is to just be a primitive in its own right. That's what we ended up doing with swap, for example.
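
To make that concrete, here is a minimal sketch, not a proposed implementation. It assumes `take(x)` means "return the old value of x and leave x in a default-like state"; the names `has_clear`, `take`, and `take_demo` are made up for illustration. The generic path relies only on default construction, move construction, and move assignment; the other path uses `.clear()` when T happens to provide it.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when T has a callable .clear() member
// (C++17 detection idiom).
template <class T, class = void>
struct has_clear : std::false_type {};

template <class T>
struct has_clear<T, std::void_t<decltype(std::declval<T&>().clear())>>
    : std::true_type {};

// Hypothetical take(): returns the old value of x and leaves x "empty".
template <class T>
T take(T& x) {
    if constexpr (has_clear<T>::value) {
        // T offers an extra primitive: one move construction plus clear(),
        // with no temporary T{} and no move assignment.
        T result(std::move(x));
        x.clear();
        return result;
    } else {
        // Generic path: default-construct a temporary, move x out,
        // then move-assign the temporary into x.
        return std::exchange(x, T{});
    }
}

// Small usage example covering both paths.
inline void take_demo() {
    std::string s = "hello";
    std::string t = take(s);   // uses the clear() path
    assert(t == "hello");
    assert(s.empty());

    int i = 42;
    int j = take(i);           // uses the generic path
    assert(j == 42);
    assert(i == 0);
}
```

Whether the `.clear()` path is actually faster for a given T is something to measure, not assume.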

2) Does such an implementation produce the optimal code for all T?
Does it only carry "salient state" (to quote the above)? Can an author
provide a better implementation for their classes?

A: The evidence suggests that an author can do better.

Thinking about it, take() is very similar to swap() in this regard. The
default swap() does a series of steps (1 move construction, 2 move
assignments, 1 destruction of a moved-from object) that we recognize can
be suboptimal for a number of types, and/or that the compiler cannot easily
(or possibly at all) optimize to the same degree that the class author can.

Yes, I like this analogy.
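
For reference, the step count for the default swap() can be checked with a small sketch. `Traced`, `generic_swap`, and `generic_swap_demo` are hypothetical names for illustration; `generic_swap` mirrors the usual textbook formulation and is not copied from any particular standard library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tracing type that counts calls to its special members.
struct Traced {
    static inline int move_ctors = 0;
    static inline int move_assigns = 0;
    static inline int dtors = 0;

    Traced() = default;
    Traced(Traced&&) noexcept { ++move_ctors; }
    Traced& operator=(Traced&&) noexcept { ++move_assigns; return *this; }
    ~Traced() { ++dtors; }
};

// Textbook generic swap, written out so the steps are visible.
template <class T>
void generic_swap(T& a, T& b) {
    T tmp(std::move(a));   // 1 move construction
    a = std::move(b);      // move assignment #1
    b = std::move(tmp);    // move assignment #2
}                          // 1 destruction of the moved-from tmp

inline void generic_swap_demo() {
    Traced::move_ctors = Traced::move_assigns = Traced::dtors = 0;
    Traced a, b;
    generic_swap(a, b);
    assert(Traced::move_ctors == 1);
    assert(Traced::move_assigns == 2);
    assert(Traced::dtors == 1);   // only tmp has been destroyed so far
}
```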

See also my P1144 "Object relocation by move plus destroy," which defines "relocate" in terms of a series of steps (1 move construction, 1 destruction).
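
Spelled out, those two steps look roughly like the sketch below. The names `relocate_sketch` and `relocate_demo` are hypothetical and chosen for this email; this is the naive move-plus-destroy formulation only, not the interface or wording from the paper.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: relocate *src into the raw storage at dst.
// After the call, *src is dead and must not be destroyed again.
template <class T>
T* relocate_sketch(T* src, T* dst) {
    T* result = ::new (static_cast<void*>(dst)) T(std::move(*src)); // 1 move construction
    src->~T();                                                       // 1 destruction
    return result;
}

inline void relocate_demo() {
    // Two raw buffers, suitably sized and aligned for a std::string.
    alignas(std::string) unsigned char from[sizeof(std::string)];
    alignas(std::string) unsigned char to[sizeof(std::string)];

    std::string* src = ::new (static_cast<void*>(from)) std::string("hello");
    std::string* dst = relocate_sketch(src, reinterpret_cast<std::string*>(to));

    assert(*dst == "hello");
    std::destroy_at(dst);   // only the relocated object is still alive
}
```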

[...] So, this points in the direction of making take() a
customization point (object).

Assignment from take() is a bit more tricky, but points in the same
direction. The fundamental difference between

list2 = take(list1);

and

list2 = std::move(list1);

is that this op= acts on both list1 and list2, while take doesn't know
about list2. So I need a new operation here -- take_assign(target,
source) or somesuch -- to make it be as efficient as a move-assignment /
swap+clear / etc. And this should be a CPO as well. With this available,
we get the desired performance back "out of the box"...
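
A rough sketch of what such an operation might look like follows. The name take_assign comes from the paragraph above, but its signature, the generic default, and the std::vector overload are assumptions made for illustration. The overload uses swap+clear and is restricted to the default allocator to keep the swap unconditionally valid.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical generic default: same effect as `target = take(source)`,
// going through a temporary T{}.
template <class T>
void take_assign(T& target, T& source) {
    target = std::exchange(source, T{});
}

// Hypothetical customization for std::vector (default allocator only):
// swap the buffers, then clear the source. No temporary vector is built.
template <class T>
void take_assign(std::vector<T>& target, std::vector<T>& source) {
    target.swap(source);
    source.clear();
}

inline void take_assign_demo() {
    std::vector<int> list1{1, 2, 3};
    std::vector<int> list2{9};
    take_assign(list2, list1);          // picks the vector overload
    assert((list2 == std::vector<int>{1, 2, 3}));
    assert(list1.empty());

    int x = 5, y = 0;
    take_assign(y, x);                  // picks the generic default
    assert(y == 5);
    assert(x == 0);
}
```

Note that the two versions already differ in what they leave behind: after swap+clear the source may keep the target's old buffer, which is the kind of detail the documentation would have to pin down.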

Agreed. But adding new customization points doesn't scale. Even `swap` is cumbersome to write — so cumbersome that when we added hashing in C++11, we didn't use the `swap` model but instead made up a different model. And then when we added three-way comparison in C++20, we didn't do it like `swap` nor like `hash`, but instead built it in as a completely new operator (and it's still cumbersome to use correctly).

See my blog post: https://quuxplusone.github.io/blog/2018/08/23/customization-point-or-named-function-pick-one/

However, yes, I am now convinced that you've grokked the essential point I was trying to make about performance.

One final warning about customization points: As soon as you let the user customize the operation, you have to document for them what you intend for them to do with it. Look at C++11 "move constructor" for the cautionary tale here. Syntactically it's a work of beauty. Semantically, though, nobody can ever agree on exactly what it means for T to be "move-constructible" or what state a moved-from T should be left in. You're going to have exactly the same kind of problem with your "std::take": is it okay to leave std::string with some capacity, as long as it compares equal to a default-constructed string? Is it okay to simply copy a trivially copyable type, such as `int`, or must I still reset the source `int` to `0`? (If you said it was okay to trivially copy an `int`: well then, is it okay to trivially copy an `optional<int>`?)
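
To illustrate the std::string question, here is a small sketch with two hypothetical implementations, `take_via_swap` and `take_via_clear` (both names invented here). Both return the old value and both leave a source that compares equal to a default-constructed string, yet nothing forces the two sources to end up with the same capacity.

```cpp
#include <cassert>
#include <string>

// Hypothetical implementation 1: swap with a fresh string.
inline std::string take_via_swap(std::string& s) {
    std::string result;
    result.swap(s);
    return result;
}

// Hypothetical implementation 2: copy the value out, then clear in place.
inline std::string take_via_clear(std::string& s) {
    std::string result(s);
    s.clear();
    return result;
}

inline void take_semantics_demo() {
    std::string a(100, 'x');
    std::string b(100, 'x');
    std::string ra = take_via_swap(a);
    std::string rb = take_via_clear(b);

    assert(ra == rb);              // both hand back the old value
    assert(a == std::string());    // both sources compare equal to ""
    assert(b == std::string());
    // a.capacity() and b.capacity() may differ here; which outcome is
    // "correct" is exactly what the documentation has to decide.
}
```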

Documentation is going to be really important here, because the majority of your motivation is "creating a new idiom where we know what it does." So, make sure you still know what it does, even after opening it up as a customization point!

(Again, compare with my P1144 "Object relocation," which makes "relocate" a new verb but does not open it up as a customization point.)

HTH,

Arthur