Pattern Matching focus

Document #: xxx
Date: 2022-09-22
Project: Programming Language C++
Reply-to: Mihail Naydenov

1 Abstract

This paper proposes to significantly reduce the scope of Pattern Matching (PM), aiming for release in C++26.

2 Motivation

Modern, post-C++11 PM has been in development for almost a decade[1,2,3]. Progress was already slow, and now that a completely alternative proposal is present (introducing “is” and “as” into the language[4]), we might end up slowing down even further. Considering our experience so far, it does not seem realistic to have a “feature complete” PM in the next 3 years. There are other major C++ features (executors, reflection) that need attention, and both active proposals are admittedly very large in scope.
Part of the problem is the task itself. PM is prone to feature creep: there is always “one more pattern” to add, one more case to cover. We must use this to our advantage, cutting the list to an acceptable minimum with the understanding that we can always add more later.

3 Proposal

The following patterns are proposed as the main scope for a future PM system:

3.1 Type switch

Arguably, the most interesting use of PM in the context of C++ is a uniform type switch. This was one of the main motivations for having PM in the first place[5,6], as well as the focus of development for both alternative designs. A uniform type switch is exactly the kind of feature that makes C++ both easier and more powerful. The ability to define conditions based on type alone, and have them work uniformly across a multitude of polymorphism paradigms (virtual classes, std::variant, std::any, etc.), allows for generic programming that is accessible at any level of expertise, including to people coming from outside the language.

In a way, PM is pseudo-code, not tightly tied to the language syntax. This can ease the way for people with experience in languages other than C++.

Having a type switch is a top priority, but it is also the hardest feature to develop, in both design and implementation.
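
To illustrate the lack of uniformity that a type switch would address, here is a minimal sketch of how the same "which type is it?" question is asked today for the three paradigms mentioned above. The types Shape, Circle, Square and IntOrString, as well as the function names, are hypothetical and exist only for this illustration. No PM syntax is shown, since none is fixed by this paper.

```cpp
#include <any>
#include <string>
#include <variant>

// Hypothetical hierarchy, used only for illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

// Hypothetical closed set of alternatives.
using IntOrString = std::variant<int, std::string>;

// Virtual classes: a chain of dynamic_cast checks.
std::string describe_shape(const Shape& s) {
    if (dynamic_cast<const Circle*>(&s)) return "circle";
    if (dynamic_cast<const Square*>(&s)) return "square";
    return "unknown";
}

// std::variant: std::get_if (or std::visit with a visitor).
std::string describe_variant(const IntOrString& v) {
    if (std::get_if<int>(&v)) return "int";
    if (std::get_if<std::string>(&v)) return "string";
    return "unknown";
}

// std::any: the pointer form of std::any_cast.
std::string describe_any(const std::any& a) {
    if (std::any_cast<int>(&a)) return "int";
    if (std::any_cast<std::string>(&a)) return "string";
    return "unknown";
}
```

All three functions express the same idea, yet each needs a different facility. A uniform type switch would let them share one spelling.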

3.2 Bindings

No PM is practically possible without binding support. Without the ability to name what is matched, there is little to no way of interacting with the rest of the code. Top priority.
Further, we should make an effort to have bindings work in tandem with other patterns, not just as a sole entity (the “Identifier pattern” in P1371). This is in regard to the “Binding Pattern” being removed from P1371, which this paper considers unfortunate. Specifically for C++, we will often need to bind at multiple levels inside the recursive pattern structure: for example, binding both the pointer and the pointee or, at the very least, binding just the pointer while continuing the pattern match to make sure the pointer is valid and/or points to the desired value. Without a binding pattern, neither of these is possible. Binding only the end result will not be enough.
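
The two multi-level cases described above can be sketched in today's C++ as follows. This is only an illustration of the intent, written with plain if checks. The function names classify and find_if_equal are hypothetical, and the eventual PM syntax for such bindings is not defined here.

```cpp
#include <string>

// Case 1: bind both the pointer (p) and the pointee (value).
std::string classify(const int* p) {
    if (p == nullptr) return "null";   // level 1: the pointer itself, bound as p
    const int& value = *p;             // level 2: the pointee, bound as value
    if (value == 0) return "points to zero";
    return "points to " + std::to_string(value);
}

// Case 2: bind only the pointer, but keep matching through it, so the
// binding is produced only for a valid pointer to the wanted value.
const int* find_if_equal(const int* p, int wanted) {
    return (p != nullptr && *p == wanted) ? p : nullptr;
}
```

In both functions the interesting name is introduced at an intermediate level of the structure, not only at its end, which is what a binding pattern would allow inside PM.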

3.3 Dereference

In the context of C++ we can’t get around dealing with dereferenceable types, be they pointers or optionals. Top priority.
It might be tempting to have automatic dereference and treat “optional” types as value-or-empty, automatically matching on the value if not empty. However, because of nesting (“pointer to pointer”, “optional of pointer”, etc.), as well as the desire to have bindings at any level (see the previous section), we will still need a way to denote “a pointer”, or at least “a level”, in order to create such bindings.
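
A small sketch of the nesting problem, using an “optional of pointer” in today's C++. The function name describe_nested is hypothetical. The point is only that each level has its own empty state, so a single automatic dereference cannot say which level is meant.

```cpp
#include <optional>
#include <string>

// An "optional of pointer" has two distinct empty states and one value state.
std::string describe_nested(const std::optional<const int*>& o) {
    if (!o) return "empty optional";        // level 1: the optional
    const int* p = *o;
    if (!p) return "null pointer";          // level 2: the pointer inside it
    return "value " + std::to_string(*p);   // level 3: the pointed-to int
}
```

An engaged optional that holds a null pointer is different from a disengaged optional, and code may need to bind or test either level explicitly.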

3.4 Expressions

Values are matched against expressions; there is no way around that fact. Top priority.
However, in order to reduce the scope and leave room for future improvements, this paper proposes limiting the expressions to literals and identifiers only, with no arbitrary expressions like a + b or c | d.

3.5 Destructuring

We already have structured bindings, so understandably there is an expectation to have a form of destructuring in PM as well. High priority.
In general, destructuring is an integral part of PM systems; in C++, however, it is arguably of lesser value because of class encapsulation. There is only so much one can do with a struct, and with the lack of class member properties (or some similar unified interface paradigm), destructuring will not be as central as in other languages, hence the slightly lower priority. Also, again because of the heavy use of encapsulation, positional-only destructuring is proposed, with no designators for the initial release. What is more, we must be extremely careful not to allow designator support without a customization point; otherwise we will damage current encapsulation practices, as people will be tempted to use public-fields-only structs just to be able to destructure them.
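
For reference, this is roughly how an encapsulated class opts in to positional destructuring today, through the tuple-like protocol used by structured bindings (C++17). The class Point and the function sum_components are hypothetical. The sketch only shows the existing language mechanism, on the assumption that PM destructuring would build on it.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical class with private data members.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Positional access, found by structured bindings as p.get<I>().
    template <std::size_t I>
    int get() const {
        static_assert(I < 2, "Point has exactly two components");
        if constexpr (I == 0) return x_;
        else return y_;
    }

private:
    int x_;
    int y_;
};

// Opt in to the existing tuple-like protocol.
namespace std {
template <>
struct tuple_size<Point> : std::integral_constant<std::size_t, 2> {};
template <std::size_t I>
struct tuple_element<I, Point> { using type = int; };
}

int sum_components(const Point& p) {
    const auto [x, y] = p;  // positional destructuring, no public fields needed
    return x + y;
}
```

The data members stay private. The class author decides which components are exposed and in what order, which is the kind of control that unrestricted designator support would bypass.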

3.6 Guard

An if check alleviates the need for more advanced patterns. High priority.

Other than the core features listed above, a lot of other groundwork remains in order to have PM ready. This includes decisions on general syntax, the introducer keyword, the placeholder/wildcard definition, expression and statement details, exhaustiveness, refutability, etc.
This proposal argues that what is presented here is just enough to make an initial PM release both attractive and doable in the next 3 years. One major advantage of the above mix of features is that the two current alternative proposals agree on their importance and even, to some degree, on their syntax.
What the current alternatives don’t agree on is how customization is done. This paper advocates no customization for the initial release beyond what is already available in the language. In other words, PM will be just syntactic sugar over current customization points, without introducing new ones. For example, if a user has made their class available for structured bindings, it will work with destructuring inside PM. It must be stressed that customization points themselves are currently under investigation for improvement[7]; combined with the fact that any customization in principle comes with a high design-time cost, we should really be conservative on that front for now.

As you can see, most of what is proposed by the current alternatives is not included here, and that’s the point. Considering that what is listed above is fairly uncontroversial, not having to spend much time on “what” will give us more time for “how”, and there are plenty of open questions that need to be solved. We can always add more later.

  1. Pattern Matching for C++

  2. Pattern Matching and Language Variants

  3. Pattern Matching

  4. Pattern matching using is and as

  5. Open Pattern Matching for C++

  6. Thoughts on pattern matching

  7. We need a language mechanism for customization points