sg20: Re: [SG20] Difficulties in teaching the use of C++20 concepts

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Date: Sat, 14 Dec 2019 21:36:18 -0500

On Sat, Dec 14, 2019 at 7:44 PM Martin Beeger via SG20 <
sg20_at_[hidden]> wrote:

> Hello everyone!
>
> I recently talked a lot with coworkers about C++20 and we tried together
> to get a grasp around what its effective use is and what good coding
> guidelines around it are.
>
> A very fundamental change in C++20 is that Concepts are finally in (yay!).
>
> Concepts will IMHO change how we think about template arguments and
> template functions radically, and will give us the power to express
> semantic contracts much more clearly in definition.

Well, IMHO the opposite. :) Ever since Stepanov in the mid-1990s, C++
programmers have known the general outlines of "generic programming": that
in C++ it's based on templates, that templates impose syntactic and
semantic requirements on their type parameters, and that a bundle of type
requirements can be called a "concept."
C++11 allows us to write "concepts" fairly conveniently in the form of
type-traits.
Orthogonally, C++98 allows us to write function templates that SFINAE away
for types that don't match their preferred concept (in the form of
`enable_if`).
C++2a adds new, shorter syntax to accomplish these tasks in fewer
keystrokes, but it doesn't change how we should *think* about generic
programming. The mere existence of shorter syntax should not lead us to use
templates more often, nor to use SFINAE more often.
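
A minimal sketch of the pieces named above, assuming a hypothetical trait `is_less_comparable` and a hypothetical function `min_of` (neither name comes from the original message). It is written as C++17 for brevity (`std::void_t` is only a convenience; the same trait can be spelled out in C++11), and the C++2a spelling is shown in a comment for comparison:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `a < b` is well-formed for two const T
// lvalues and the result can be converted to bool.
template<class T, class = void>
struct is_less_comparable : std::false_type {};

template<class T>
struct is_less_comparable<T, std::void_t<decltype(
    static_cast<bool>(std::declval<const T&>() < std::declval<const T&>())
)>> : std::true_type {};

// The "concept" applied with enable_if: min_of SFINAEs away for types
// that do not satisfy the trait.
template<class T, std::enable_if_t<is_less_comparable<T>::value, int> = 0>
constexpr const T& min_of(const T& a, const T& b) {
    return b < a ? b : a;
}

// Roughly the same thing in C++2a syntax (comment only, so that this
// sketch still compiles as C++17):
//   template<class T>
//   concept LessComparable = requires(const T& a, const T& b) {
//       { a < b } -> std::convertible_to<bool>;
//   };
//   template<LessComparable T>
//   constexpr const T& min_of(const T& a, const T& b);

struct Opaque {};  // hypothetical type with no operator<

static_assert(is_less_comparable<int>::value, "int supports <");
static_assert(!is_less_comparable<Opaque>::value, "Opaque does not");
```

Both spellings state the same requirement; the C++2a one is shorter to write.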

> But as one uses
> concepts to modernize template functions, one quickly realizes that
> nearly all template functions either have design flaws or have
> constraints on template parameters which can be expressed by concepts.
>

Yes, there is no such thing as a "truly generic" template. A generic
function does things to its arguments in terms of their capabilities
(properties, affordances). A function that says "I compare my arguments"
requires that its arguments be comparable (using the < syntax). A function
that says "I swap my arguments" requires that its arguments be swappable
(using some syntax). So, yes, all function templates impose (syntactic)
requirements on their type parameters.
Any syntactic requirement can be expressed via C++2a Concepts, or via plain
old C++11.
So, yes, every template imposes syntactic requirements on its type
parameters which can be expressed by Concepts.
Equivalently, every template imposes syntactic requirements on its type
parameters which can be expressed *without* Concepts.

If you have templates in your codebase today that are not SFINAE-friendly —
that are not std::enable_if'ed on some type-trait that describes their
syntactic requirements — then you should ask yourself, "Why?" Why would a
C++11 programmer occasionally write an "unconstrained" template, even as he
knows that it must impose some kind of requirement and that the requirement
must be expressible in C++?

Off the top of my head:
(1) The constraint is so complicated that it's not worth the effort to
specify. For example, imagine the variadic version of this template:
    template<class F, class T1, class T2, class T3>
    bool all_permutations(F f, T1 t, T2 u, T3 v) {
        return f(t,u,v) && f(t,v,u) && f(u,t,v) && f(u,v,t)
            && f(v,t,u) && f(v,u,t);
    }
It's probably possible to work out the C++ equivalent of "F is callable
with all permutations of Ts...", but the cost/benefit analysis is against
it.
(2) The function does not need to mess with overload resolution, and it is
so trivial that its type-requirement is no more or less than that its body
compiles. For example:
    template<class T, class U> bool equals(T t, U u) { return t == u; }
The error message "can't find a suitable operator== for t == u" is going to
be just as useful to the client programmer as "can't find a suitable equals
for equals(x, y)".
(3) The function does not need to mess with overload resolution, and for
usability's sake, it prefers to static_assert its type-requirements rather
than SFINAE away. For example:
    template<class Socket> void read_from_socket(Socket s) {
        static_assert(is_socket_v<Socket>); ...
    }
(4) The compile-time cost of SFINAE is so large that the codebase has
guidelines suggesting "no unnecessary SFINAE." Notice that many templates
are used as implementation details. For example:
    template<class Iter>
    void sort_helper(Iter a, Iter b, std::random_access_iterator_tag) { ... }

    template<class Iter> requires is_swappable_v<iterator_value_t<Iter>>
    void sort(Iter a, Iter b) {
        sort_helper(a, b, iterator_category_t<Iter>{});
    }
I think most template programmers would agree that it would be
counterproductive to repeat the same constraints on `sort_helper` that were
already applied on `sort`. For one thing, it violates the Don't Repeat
Yourself principle; but more importantly, it costs compile time and brings
no benefit in return.
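
One observable difference between the static_assert style in (3) and a SFINAE-constrained template can be sketched as follows. The trait `is_socket`, the type `TcpSocket`, and the two probe lambdas are hypothetical stand-ins invented for this illustration; the sketch is written as C++17:

```cpp
#include <type_traits>

// Hypothetical stand-in for the is_socket_v trait used in (3): a type
// counts as a socket here if it has a nested `socket_tag` typedef.
template<class T, class = void>
struct is_socket : std::false_type {};
template<class T>
struct is_socket<T, std::void_t<typename T::socket_tag>> : std::true_type {};
template<class T>
constexpr bool is_socket_v = is_socket<T>::value;

struct TcpSocket { using socket_tag = void; };  // hypothetical socket type

// Style (3): always participates in overload resolution; a wrong type is
// rejected only when the body is instantiated.
template<class Socket>
void read_asserting(Socket) {
    static_assert(is_socket_v<Socket>, "read_asserting requires a socket");
}

// SFINAE style: not a candidate at all for non-sockets.
template<class Socket, std::enable_if_t<is_socket_v<Socket>, int> = 0>
void read_constrained(Socket) {}

// Probes that ask "is this call expression well-formed?" without
// instantiating either function body.
const auto probe_asserting   = [](auto s) -> decltype(read_asserting(s)) {};
const auto probe_constrained = [](auto s) -> decltype(read_constrained(s)) {};

static_assert(std::is_invocable_v<decltype(probe_constrained), TcpSocket>, "");
static_assert(!std::is_invocable_v<decltype(probe_constrained), int>, "");
// The static_assert version claims to accept int; the error would only
// appear if read_asserting(42) were actually called.
static_assert(std::is_invocable_v<decltype(probe_asserting), int>, "");
```

If nothing in the program ever asks that question about the function, the two styles differ mainly in the error message the client sees.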

> This led us to the conclusion that one should have a coding guideline
> that flags unconstrained template parameters as a code smell.

I think this is viable for small codebases, but I suspect a significant
fraction of large codebases would run into #4 above if they tried to add
constraints on *all* their templates.
Perhaps you could eliminate that point of contention by saying something
like "unconstrained template parameters *in public APIs* are a code
smell"... but then good luck explaining to the compiler what counts as a
"public API." ;)

> We really struggled to find one by looking through our codebase. The plus
> function came to mind, but this requires the arguments to form a magma
> at least, which is definitely a valid requires clause.

The plus function obviously requires `t + t` to be well-formed, which is a
valid requires-clause.
FWIW, I dispute your claim that this has anything to do with mathematical
magmas. A magma is a set (i.e. a type) with a binary operation (let's spell
it `operator+`) which is closed; I interpret that in a C++ context to mean
that decltype(t+t) ought to be T, or at least something explicitly
convertible to T. But that's definitely not a requirement imposed by the
`plus` function! Not if it's defined as
    template<class T> auto plus(T t, T u) { return t + u; }
anyway. However, I suggest we don't pursue *this* rabbit-hole. We can
pursue the next one if you like. :)
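
To make the "not closed" point concrete, here is a small sketch with two hypothetical types, `Vec` and `LazySum` (invented for this illustration), where adding two `Vec`s yields an unrelated expression object. `plus` is placed in a namespace and marked constexpr only so that the sketch can check itself at compile time; it assumes C++17:

```cpp
#include <type_traits>

// Hypothetical types: adding two Vecs yields a lazy expression object
// that is neither a Vec nor convertible to one.
struct Vec { double x; };
struct LazySum {
    double lhs, rhs;
    constexpr double eval() const { return lhs + rhs; }
};
constexpr LazySum operator+(Vec a, Vec b) { return LazySum{a.x, b.x}; }

namespace demo {
// Same shape as the `plus` above.
template<class T> constexpr auto plus(T t, T u) { return t + u; }
}

// plus<Vec> is perfectly usable...
static_assert(demo::plus(Vec{1.0}, Vec{2.0}).eval() == 3.0, "");
// ...even though the operation is not closed over Vec.
static_assert(std::is_same_v<decltype(demo::plus(Vec{}, Vec{})), LazySum>, "");
static_assert(!std::is_constructible_v<Vec, LazySum>, "");
```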

> The next
> candidate was the identity function, but for the identity function to be
> correct, its result must be identical to the input, which only has valid
> meaning if the input and the output are actually equality comparable,
> which is a valid requires clause.
>

Here I absolutely disagree. The `identity` function
    template<class T> T identity(T t) { return t; }
requires that T be move-constructible and destructible, but absolutely does
not require that T be equality-comparable. For example, `identity` remains
a useful operation even when T is `std::function<void()>`.
`std::function<void()>` is not, and cannot ever be made,
equality-comparable.
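
A short sketch of that claim, assuming a hypothetical detection trait `has_equals` and a hypothetical helper `identity_demo` (both invented here). `identity` is placed in a namespace only to keep it apart from anything of the same name in the standard library; the sketch assumes C++17:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <type_traits>
#include <utility>

namespace demo {
template<class T> T identity(T t) { return t; }
}

// Hypothetical detection trait: is `a == b` well-formed for two const T?
template<class T, class = void>
struct has_equals : std::false_type {};
template<class T>
struct has_equals<T, std::void_t<decltype(
    std::declval<const T&>() == std::declval<const T&>()
)>> : std::true_type {};

static_assert(has_equals<int>::value, "");
static_assert(!has_equals<std::function<int()>>::value, "");

// identity still works for a type that cannot be compared with ==, and
// for a type that can only be moved.
inline void identity_demo() {
    std::function<int()> f =
        demo::identity(std::function<int()>([] { return 35; }));
    std::unique_ptr<int> p = demo::identity(std::make_unique<int>(7));
    assert(f() == 35);
    assert(*p == 7);
}
```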

The type-requirements of a generic algorithm are driven by what the
algorithm *actually does*. It requires what it *does* to be well-formed. My
kneejerk impression is that no generic algorithm should ever "require" some
capability of its type parameter if its implementation does not depend on
that capability.
To take a really extreme and probably singular example, imagine what the
STL would have been like if we'd said, "The std::copy algorithm requires
the expression `std::addressof(*dest)` to have pointer type." That's
"obvious," right, because the point of std::copy is to be a type-safe
replacement for memcpy, and objects always have addresses? ...But then we
never would have gotten ostream_iterator or back_insert_iterator!
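
As a sketch of why such a requirement would have been too strong: for the two iterator adaptors just mentioned, `*dest` does not refer to an element at all, and `std::copy` works anyway because all it does with `dest` is assign through it and increment it. The helper name `copy_demo` is hypothetical; the sketch assumes C++17:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// For back_insert_iterator, *dest returns the iterator itself, and the
// assignment to it is what calls push_back.
using BackIt = std::back_insert_iterator<std::vector<int>>;
static_assert(std::is_same_v<decltype(*std::declval<BackIt&>()), BackIt&>, "");

inline void copy_demo() {
    const std::vector<int> src = {1, 2, 3};

    std::vector<int> grown;  // starts empty: no destination elements exist yet
    std::copy(src.begin(), src.end(), std::back_inserter(grown));
    assert(grown == src);

    std::ostringstream out;  // a stream has no elements to point at either
    std::copy(src.begin(), src.end(), std::ostream_iterator<int>(out, ","));
    assert(out.str() == "1,2,3,");
}
```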

> So we came up with the "no hidden semantic requirements" rule,
> which states that a template function should state its requirements about
> input types via concepts. If there are axioms involved, use named
> concepts to represent them. Where this leads us remains to be seen, but
> I expect the result to be far clearer code and far better
> compile-time-verification of the code.
>

I suspect that a sufficiently motivated programmer could always find a
*semantic* requirement that you failed to capture in your naming scheme. ;)
Even something as simple as "fclose(fp) expects fp to be a file that is
open, not a file that has already been closed." But maybe it's
domain-dependent; maybe "no hidden semantic requirements" actually works in
your particular codebase.

>
> This brings me to auto. auto-type arguments are just template
> parameters. A completely unconstrained template parameter, to be
> precise. While I am absolutely in favor of what we call "localized
> auto", auto in cases where the lifespan is a few lines or a single scope
> (e.g. for loops, iterators, make_unique results & co.), auto in
> interfaces obscures intent and breaks the rule of "no hidden semantic
> requirements".
>
> So we extended the guideline to: avoid auto in interfaces in C++20,
> especially with template parameters, as there now is a clearly better
> alternative: using the appropriate concept. That, unfortunately, didn't
> go well.
>
> C++ Committee members like Herb Sutter have preached "Almost always
> auto" for almost a decade, and so our guidelines seem to oppose that
> C++ guru wisdom.

Well, I dare you to name a *second* WG21 member (besides Herb Sutter) who
has preached "almost always auto." It's very easy to find gurus on the
other side of that particular issue. I don't think you should worry about
"opposing guru wisdom" in this case. Go for it!

> Also it is a lot more convenient to not have to think
> about the semantics of your function and just write auto everywhere. So
> I went looking for a citable reference that says: "when both apply,
> prefer concepts over almost always auto" and did not find prominent
> references about it. Does anyone have some links of reference for me I
> can cite, which clearly state, that in a C++20 world, almost always auto
> is no longer the right thing to do? Given that a well-thought out
> concept already exists and is defined, do you think we should prefer the
> concept over auto?
>

Perhaps, but see the 4 examples above of places where you might not want to
burden a template with SFINAE.
Unconstrained "auto" in C++2a is just a synonym for a plain old template
parameter, so it would have the same pros and cons.

> But even if I manage to convince people that auto in interfaces has had its
> time, and from C++20 on is a smell in new code, the next problem arises.
> If we want to have a significant motion away from auto, we must clearly
> discourage its use in interfaces. But a rule "avoid auto in interfaces"
> seemed to confuse others in practice. A lot. So I tried to figure out
> what happened. The problem was: the shorthand notation for concepts was
> "void func(ForwardIterator auto it)". From a teaching standpoint it is
> incredibly confusing that the shorthand notation, which should be
> preferred over auto, contains a semantically meaningless auto! A rule
> along the lines of "just avoid auto" is much simpler to teach than
> a rule along the lines of "just avoid auto, except when there is this
> confusing thing which spells auto but is not really a semantic auto, as
> it will not match what auto normally matches". So as a result, this
> shorthand notation tends to be avoided out of misinterpretation of the
> guidelines, which then makes concepts unnecessarily hard to use.
>

I do think it's important to teach that (as of C++14) there are two
different possible meanings for "auto" depending on where it appears in the
program. When I write

    auto f() { auto i = 0; for (auto&& elt : vec) ++elt; return i; }
    template<class T> void g() { auto x = T(); }

all four of those "auto"s mean the same thing: "There is one single
concrete type that I have in mind, and I am just too lazy to write it out.
Compiler, please infer it for me." (In the last case, the concrete type
differs in each specific template instantiation, but it's always the same
as `T`.)
In contrast, when I write

    std::sort(first, last, [&](auto&& a, auto&& b) { return a < b; });
or in C++2a
    void h(auto x) { static int y = 0; std::cout << x << ++y; }

there the "auto"s mean something vastly different: "This thing is a
template. There is no one concrete type I have in mind; please generate a
template that can be instantiated for many different things."
Importantly, since `h` is a template, there are many different `y`s — one
per instantiation — not just one `y`.
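
The difference can be checked directly. In this sketch `h` is spelled as an explicit template so that it compiles as C++17 (`int h(auto x)` would declare the same kind of thing in C++2a), and it returns its counter instead of printing it. The global `vec` and the helper `count_demo` are assumptions added for the illustration:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

std::vector<int> vec = {1, 2, 3};

// Not a template: each auto stands for exactly one type, so f is a single
// function whose type is int().
auto f() { auto i = 0; for (auto&& elt : vec) ++elt; return i; }
static_assert(std::is_same_v<decltype(&f), int (*)()>, "");

// A template: returns how often this particular instantiation was called.
template<class T>
int h(T) { static int y = 0; return ++y; }

// Each instantiation owns its own counter.
inline void count_demo() {
    assert(h(1) == 1);    // h<int>:    first call
    assert(h(2) == 2);    // h<int>:    second call, same y
    assert(h(2.5) == 1);  // h<double>: a different y
    assert(h('c') == 1);  // h<char>:   yet another y
}
```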

For this reason, I conservatively suggest avoiding the `void func(Iterator
auto it)` syntax that you dislike, and preferring an old-fashioned
    template<Iterator It>
    void func(It it) { ... }
or even
    template<class It> requires Iterator<It>
    void func(It it) { ... }
which makes it clear that you *want* a template and you *want* it to SFINAE
away when the parameter type wouldn't be an Iterator.

> I can understand that a syntax of "void func(ForwardIterator it)" raised
> some eyebrows, but what is the reasoning behind preferring a syntax of
> "void func(ForwardIterator auto it)" over alternatives like "void
> func(concept ForwardIterator it)" or similar? Or am I missing a way to
> consistently tell this story and not confuse people?
>

I don't think `void func(concept ForwardIterator it)` was ever proposed.
("If you didn't want the product to be bad then why didn't you write a
paper proposing that the product be good?" ;))
However, `concept Foo` is also more typing than `Foo auto`, so I don't
think it's an improvement.
Right now, `auto x` in a lambda parameter list means "secretly a template,"
so it was the obvious choice when secret templates were extended to work
with plain old functions. It was a GCC extension long before it was in
C++2a.
(In fact, just last week I had to explain to an actual beginner student
that one of his C++14 program's `auto`s was legit and the other was
exploiting a non-standard GCC extension! Stupid GCC...)

> Has this been talked about or what is the general plan to get our
> community to move away from unconstrained template parameters and auto?
>

- Whose community (yours in particular, or postulating some global C++
community)?
- *Should* any community move away from unconstrained template parameters,
and if so, why?

If you don't like the visual similarity of
    void foo(Iterator auto x) // TWO WORDS GOOD
    void foo(auto x) // ONE WORD BAD
then what would you say to
    template<Iterator It> void foo(It x) // NO CLASS GOOD
    template<class It> void foo(It x) // CLASS BAD
?

I notice that `template<class It> void foo(It x)` doesn't contain the
dreaded word "auto", but presumably you'd still consider it a "bad"
unconstrained template, right? So it's not exactly the word "auto" that's
causing you trouble...?

–Arthur

P.S. — On the improbability of experts' being able to capture even
*syntactic* requirements correctly (let alone *semantic* requirements), see
https://quuxplusone.github.io/blog/2018/06/12/attribute-noexcept-verify/

Received on 2019-12-14 20:38:55