sg20: Re: [SG20] Difficulties in teaching the use of C++20 concepts

From: Martin Beeger <martin.beeger_at_[hidden]>
Date: Sun, 15 Dec 2019 14:04:31 +0100

Hello everyone.

A big thank you for the very insightful and quite long responses. I will
try to respond to all of them, but this will take me some time.

Am 15.12.19 um 03:36 schrieb Arthur O'Dwyer:
> On Sat, Dec 14, 2019 at 7:44 PM Martin Beeger via SG20
> <sg20_at_[hidden] <mailto:sg20_at_[hidden]>> wrote:
>
> Concept will IMHO change how the think about template arguments and
> template function radically, and will give us the power to express
> semantic contracts much more clearly in definition.
>
>
> Well, IMHO the opposite. :) Ever since Stepanov in the mid-1990s, C++
> programmers have known the general outlines of "generic programming":
> that in C++ it's based on templates, that templates impose syntactic
> and semantic requirements on their type parameters, and that a bundle
> of type requirements can be called a "concept."
> C++11 allows us to write "concepts" fairly conveniently in the form of
> type-traits.
> Orthogonally, C++98 allows us to write function templates that SFINAE
> away for types that don't match their preferred concept (in the form
> of `enable_if`).

I had a really good time reading "From Mathematics to Generic
Programming" by Stepanov last year, which then motivated me to read most
of "Elements of Programming" and although I somewhat knew about generic
programming and Iterator concepts before, it really changed the way I
think about writing templates. Armed with that knowlegde I tried to
improve code quality by using enable_if as a concepts replacement. I
first focussed on some key places, where to most lifting errors to
compile time could be done. The results were, that it indeed catches
error, but compile time explodes. It basically left us with the
unfortunate choice of enable_if(ing) or code or compiling our code, but
not both. I expect to be better off when Modules arrive, but until then,
that in reality. And getting coworkers to think in terms of concepts
when they are not at all enforced and completely invisible is hard.

> C++2a adds new, shorter syntax to accomplish these tasks in fewer
> keystrokes, but it doesn't change how we should /*think*/ about
> generic programming. The mere existence of shorter syntax should not
> lead us to use templates more often, nor to use SFINAE more often.

Interestingly Stepanovs book lead me already before concepts arrived to
use templates less and localize them, by defining (or including from
abseil) semantic types and program against valid semantic non-template
types like duration instead of having templates and overload sets for
different duration types (double seconds, int64_t ticks etc.).

>
> But as one uses
> concepts to modernize template functions, one quickly realizes that
> nearly all template functions either have design flaws or have
> constraints on template parameters which can be expressed by concepts.
>
>
> Yes, there is no such thing as a "truly generic" template. A generic
> function does things to its arguments in terms of their capabilities
> (properties, affordances). A function that says "I compare my
> arguments" requires that its arguments be comparable (using the <
> syntax). A function that says "I swap my arguments" requires that its
> arguments be swappable (using some syntax). So, yes, all function
> templates impose (syntactic) requirements on their type parameters.
> Any syntactic requirement can be expressed via C++2a Concepts, or via
> plain old C++11.
> So, yes, every template imposes syntactic requirements on its type
> parameters which can be expressed by Concepts.
> Equivalently, every template imposes syntactic requirements on its
> type parameters which can be expressed /*without */Concepts.
>
> If you have templates in your codebase today that are not
> SFINAE-friendly — that are not std::enable_if'ed on some type-trait
> that describes their syntactic requirements — then you should ask
> yourself, "Why?" Why would a C++11 programmer occasionally write an
> "unconstrained" template, even as he knows that it must impose some
> kind of requirement and that the requirement must be expressible in C++?

As I said above: compile time. enable_if in about 20 core places doubles
my compile time. type_traits and enable if template metaprogramming
costs a lot.

>
> Off the top of my head:
> (1) The constraint is so complicated that it's not worth the effort to
> specify. For example, imagine the variadic version of this template:
> template<class F, class T1, class T2, class T3>
> bool all_permutations(F f, T1 t, T2 u, T3 v) {
> return f(t,u,v) && f(t,v,u) && f(u,t,v) && f(u,v,t) &&
> f(v,t,u) && f(v,u,t);
> }
> It's probably possible to work out the C++ equivalent of "F is
> callable with all permutations of Ts...", but the cost/benefit
> analysis is against it.

This smells to me. You are doing something complicated, but general. So
try to find an implementation elsewhere. Does the STL have it? Does
abseil have it? Does hana have it? If not, is your original problem
perhaps better seen in a different way? It may be the right thing to do,
but sth. being a smell means for me "a reviewer stumbling upon should
have some questions and some discussions" and "if that doesn't help,
there should a comment explaining that you thought about it", not "you
should never merge that".

Also I think the permutation stuff (at least very similar) is in
Stepanovs book and I think he hard concepts for it (permutations are the
basis of permuation-based sorting). So while I get your point, you might
choose an example where Stepanov hasn't got you covered IIRC.

> (2) The function does not need to mess with overload resolution, and
> it is so trivial that its type-requirement is no more or less than
> that its body compiles. For example:
> template<class T, class U> bool equals(T t, U u) { return t == u; }

This function is just a wrapper. You rename equality and inherit all the
requirements. C++ hat no way to inherit/lift concepts, so yeah wrapper
functions are a good answer to my questions. Your function requires
copyability on top, but that can be fixed. There is a class of
"semantically inactive" functions, like wrapper functions which
themselves to not have concept requirements. I did not see that, thank you.

> The error message "can't find a suitable operator== for t == u" is
> going to be just as useful to the client programmer as "can't find a
> suitable equals for equals(x, y)".
> (3) The function does not need to mess with overload resolution, and
> for usability's sake, it prefers to static_assert its
> type-requirements rather than SFINAE away. For example:
> template<class Socket> void read_from_socket(Socket s) {
> static_assert(is_socket_v<Socket>); ... }

As for such a function the definition and declaration would usually
seperate (as it is not a one-liner) and if I program against it reading
its declaration, I will use them wrong. Yes, the compiler shouts at me,
which is good, but annoying. In a review I would question the usuability
gain, looking a the actual usage of the function and other overloads.

> (4) The compile-time cost of SFINAE is so large that the codebase has
> guidelines suggesting "no unnecessary SFINAE." Notice that many
> templates are used as implementation details. For example:
> template<class Iter> void sort_helper(Iter a, Iter b,
> std::random_access_tag) { ... }
> template<class Iter> requires
> is_swappable_v<iterator_value_t<Iter>> void sort(Iter a, Iter b) {
> sort_helper(a, b, iterator_category_t<Iter>{}); }
> I think most template programmers would agree that it would be
> counterproductive to repeat the same constraints on `sort_helper` that
> were already applied on `sort`. For one thing, it violates the Don't
> Repeat Yourself principle; but more importantly, it costs compile time
> and brings no benefit in return.

This is again a case of not repeating conecpts in semantically inactive
wrapper functions, I get that. This specific wrapper function should no
longer be needed with C++20, am I right?

>
> This lead us to the conclusion, that one should have a coding
> guidelines
> that flags unconstrained template parameters as a code smell.
>
>
> I think this is viable for small codebases, but I suspect a
> significant fraction of large codebases would run into #4 above if
> they tried to add constraints on /all/ their templates.

A significantly large codebase (where "significantly" is up for debate)
can only afford a limited number of template function instantiations.
Otherwise its compile time will blow up anyway. I would rather afford a
few less template functions, with well defined named concepts instead of
a larger number of template function w/o them. Hopefully modules can
alleviate some of this tension.

>
> We really struggled to find one by looking though our codebase.
> The plus
>
> function came to mind, but this requires the arguments to form a
> magma
> at least, which is definitively a valid requires clause.
>
>
> The plus function obviously requires `t + t` to be well-formed, which
> is a valid requires-clause.
> FWIW, I dispute your claim that this has anything to do with
> mathematical magmas. A magma is a set (i.e. a type) with a binary
> operation (let's spell it `operator+`) which is closed; I interpret
> that in a C++ context to mean that decltype(t+t) ought to be T, or at
> least something explicitly convertible to T. But that's definitely not
> a requirement imposed by the `plus` function! Not if it's defined as
> template<class T> auto plus(T t, T u) { return t + u; }
> anyway. However, I suggest we don't pursue /*this*/ rabbit-hole. We
> can pursue the next one if you like. :)

Oh yeah, there is a deep rabbit hole about whether is is benefical to
have plus over a closed set separate from plus over an open set. And how
to express commutativity here... Lets not go there.

>
> The next
> candidate was the identity function, but for the identity function
> to be
> correct, its result must be identical to the input, which only has
> valid
> meaning if the input and the output is actually equality comparable,
> which is a valid requires clause.
>
>
> Here I absolutely disagree. The `identity` function
> template<class T> T identity(T t) { return t; }
> requires that T be move-constructible and destructible, but absolutely
> does not require that T be equality-comparable. For example,
> `identity` remains a useful operation even when T is
> `std::function<void()>`. `std::function<void()>` is not, and cannot
> ever be made, equality-comparable.
>
> The type-requirements of a generic algorithm are driven by what the
> algorithm /*actually does*/. It requires what it /*does*/ to be
> well-formed. My kneejerk impression is that no generic algorithm
> should ever "require" some capability of its type parameter if its
> implementation does not depend on that capability.
But for a non-comparable type template<class T> T identity(T t) { return
T {}; } is often a faster implementation. It may not be what the
programmer expected, but how to write test? Equality is a postcondition
of the type. As the postcondition is impossible to verify, semantics
meaning of the function gets lost and the function itself gets
misleading. If you wanted a "semantically inactive" no-op function,
state that an name it appropiuately. Identity has a mathematical meaning
that different from no-op. I think the identity function on
non-comparable types would not be "provable" in Lisa Lippincott's sense.
So I personally claim it should not be written that way.
> To take a really extreme and probably singular example, imagine what
> the STL would have been like if we'd said, "The std::copy algorithm
> requires the expression `std::addressof(*dest)` to have pointer type."
> That's "obvious," right, because the point of std::copy is to be a
> type-safe replacement for memcpy, and objects always have addresses?
> ...But then we never would have gotten ostream_iterator or
> back_insert_iterator!

The copy function is defined without mention of memcpy. A copy function
is a function which takes Range T and produces a new range T2 where the
following two postconditions hold: The range T is unchanged and the
range is T2 equal to the original range T. And if adressof(*dest) is not
a pointer, the function can still reach its postconditions. So this is
not really a requirement. memcpy-ability or relocatability (yes I have
seen your talk :)) how you call it, may be a requirement of candidate
within the overload set of std::copy to select the fastest algorithm but
never a condition on the copy overload set as a whole, because such a
set I would call copy_to_pointer_location.

>
> I suspect that a sufficiently motivated programmer could always find a
> /*semantic*/ requirement that you failed to capture in your naming
> scheme. ;) Even something as simple as "fclose(fp) expects fp to be a
> file that is open, not a file that has already been closed." But
> maybe it's domain-dependent; maybe "no hidden semantic requirements"
> actually works in your particular codebase.

I agree. This is a Sean-Parent-esk "goal". You will never reach it, but
dude, how much cool stuff you can get from trying and even getting part
of the way there.

If fp was an RAII object which invariant was: for the object to exist, a
open file must exist, the semantic requirement of fclose would be to get
a valid FP object of which its invariant is satifisfied, et voila, you
are done. You cannot correct missing invariant on the type definition of
a function parameter at the function declaration. Get your types right
and RAII for the win.

> [...]
>
> preferred over auto, contains a semantically meaningless auto! A rule
> along the lines of "just avoid auto" is much simpler than to teach
> than
> a rule along the lines of "just avoid auto, except when there is this
> confusing thing which spells auto but is not really a semantic
> auto, as
> it will not match what auto normally matches". So a a result, this
> shorthand notion tends to be avoided out of misinterpretation of the
> guidelines, which then makes concepts unnecessarily hard to use.
>
>
> I do think it's important to teach that (as of C++14) there are two
> different possible meanings for "auto" depending on where it appears
> in the program. When I write
>
> auto int f() { auto i = 0; for (auto&& elt : vec) ++elt; return i; }
> template<class T> void g() { auto x = T(); }
>
> all four of those "auto"s mean the same thing: "There is one single
> concrete type that I have in mind, and I am just too lazy to write it
> out. Compiler, please infer it for me." (In the last case, the
> concrete type differs in each specific template instantiation, but
> it's always the same as `T`.)
> In contrast, when I write
>
> std::sort(first, last, [&](auto&& a, auto&& b) { return a < b; });
> or in C++2a
> void h(auto x) { static int y = 0; std::cout << x << ++y; }
>
> there the "auto"s mean something vastly different: "This thing is a
> template. There is no one concrete type I have in mind; please
> generate a template that can be instantiated for many different things."
> Importantly, since `h` is a template, there are many different `y`s —
> one per instantiation — not just one `y`.

I will go into more detail in the response to Herb Sutters mail, which
also covers this.

>
> For this reason, I conservatively suggest avoiding the `void
> func(Iterator auto it)` syntax that you dislike, and preferring an
> old-fashioned
> template<Iterator It>
> void func(It it) { ... }
> or even
> template<class It> requires Iterator<It>
> void func(It it) { ... }
> which makes it clear that you /*want*/ a template and you /*want*/ it
> to SFINAE away when the parameter type wouldn't be an Iterator.
This is worth some experimentation. Thank you.
>
>
> I can understand that a syntax of "void func(ForwardIterator it)"
> raised
> some eyebrows, but what is the reasoning behind preferring a
> syntax of
> "void func(ForwardIterator auto it)" over alternatives like "void
> func(concept ForwardIterator it)" or similar? Or am I missing a
> way to
> consistently tell this story and not confuse people?
>
>
> I don't think `void func(concept ForwardIterator it)` was ever proposed.
> ("If you didn't want the product to be bad then why didn't you write a
> paper proposing that the product be good?" ;))
I value understanding the current situation before one tries to change
it. So I deemed a mail to a mailing list a better start than a paper.
And as I found the misunderstanding to be on my end, this was the right
choice.
> However, `concept Foo` is also more typing than `Foo auto`, so I don't
> think it's an improvement.
> Right now, `auto x` in a lambda parameter list means "secretly a
> template," so it was the obvious choice when secret templates were
> extended to work with plain old functions. It was a GCC extension
> long before it was in C++2a.
> (In fact, just last week I had to explain to an actual beginner
> student that one of his C++14 program's `auto`s was legit and the
> other was exploiting a non-standard GCC extension! Stupid GCC...)
Herbs mail told me what I am missing and my lack of knowledge over the
final version of concepts did to lead to that confusion on my end. Sorry.
>
> Has this been talked about or what is the general plan to get our
> community to move away from unconstrained template parameters and
> auto?
>
>
> - Whose community (yours in particular, or postulating some global C++
> community)?
> - /*Should*/ any community move away from unconstrained template
> parameters, and if so, why?
>
> If you don't like the visual similarity of
> void foo(Iterator auto x) // TWO WORDS GOOD
> void foo(auto x) // ONE WORD BAD
> then what would you say to
> template<Iterator It> void foo(It x) // NO CLASS GOOD
> template<class It> void foo(It x) // CLASS BAD
> ?
>
> I notice that `template<class It> void foo(It x)` doesn't contain the
> dreaded word "auto", but presumably you'd still consider it a "bad"
> unconstrained template, right? So it's not exactly the word "auto"
> that's causing you trouble...?

The game-changer here is actually the simple line in Herbs mail:

iterator auto first = vec.begin()

I did not know that concepts could also be used to constrain auto in
immediately deduced concept like statements, not only in later
instantiated template contexts. And as both were called auto, this
should have been somewhat obvious to me but it wasn't. In my mind,
concepts could only be used to constrain templates, and I never thought
of auto it= vec.begin(); as a template, but as a deduced type. So I did
not expect you to be able to constrain it, and worried about the task to
disambigaute between these two kinds.

Now everything is beautiful: The story to tell is just use auto, but
when you do and you have a named concept in mind and at hand, constrain
your auto by using the concept. You do that, by simply writing the
constraint before the auto. If that works for all auto contexts and no
asterisk is needed, this is really beautiful.

Martin

Received on 2019-12-15 07:07:00