Date: Fri, 23 May 2025 23:53:21 +0000
On Friday, May 23rd, 2025 at 3:04 PM, Sebastian Wittmeier via Std-Proposals <std-proposals_at_[hidden]> wrote:
> The question is, what is really needed or missing?
>
> - a general switch case for non-integral or custom data types:
>
> will be provided by pattern matching
I think what is missing is the ability to benefit
more from compiler-generated jump tables.
A number of posts in this thread imply that
"a `switch` on non-integers is not a jump table,"
which is simply not true. The `switch` body is always
the table to jump into[1]; what varies is the method
used to preprocess the input condition, and
compilers already do all kinds of preprocessing.
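
To make "preprocess the input condition" concrete, here is
a minimal sketch of what users can already write by hand
today: hash the string to an integer, `switch` on the hash,
then confirm the match. The names `fnv1a`, `Command`, and
`parse_command` are made up for illustration; this is not
a proposed design, and a real implementation could pick a
different preprocessing step (perfect hashing, a trie, ...).

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper (not a standard facility): 64-bit FNV-1a,
// usable in constant expressions so it can appear in case labels.
constexpr std::uint64_t fnv1a(std::string_view s) {
    std::uint64_t h = 0xcbf29ce484222325ull;  // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 0x100000001b3ull;                // FNV prime
    }
    return h;
}

enum class Command { Start, Stop, Status, Unknown };

// Preprocess the input (hash it), then enter the switch body once.
// Two labels with the same hash would be rejected as duplicate case
// values at compile time; the equality test in each case guards
// against run-time inputs that merely collide with a label.
Command parse_command(std::string_view s) {
    switch (fnv1a(s)) {
    case fnv1a("start"):
        return s == "start" ? Command::Start : Command::Unknown;
    case fnv1a("stop"):
        return s == "stop" ? Command::Stop : Command::Unknown;
    case fnv1a("status"):
        return s == "status" ? Command::Status : Command::Unknown;
    default:
        return Command::Unknown;
    }
}

// The labels used above hash to distinct values.
static_assert(fnv1a("start") != fnv1a("stop"));
static_assert(fnv1a("stop") != fnv1a("status"));
```

Whether the compiler lowers this particular `switch` to a
jump table, a binary search, or a branch chain is up to
the implementation.
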
I would like to understand `switch...case`
as "table-based control flow." There are two
implications:
1. Lexically sharing the flow means
lexically sharing the control.
2. There is a single entry for any input condition.
Pattern matching doesn't check either of
these. At a semantic level, patterns are tried
sequentially from top to bottom, and users
must keep this in mind to write correct code.
[1] Except when branch prediction is deemed
more efficient, in which case saying "`switch` on
integers doesn't have to emit a jump table"
would be more accurate.
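
The two implications above, and the contrast with
sequential matching, can be sketched in today's C++.
Pattern matching is not in the language yet, so the
second function uses an `if` chain as a stand-in for
first-match-wins semantics; that substitution and the
names `cost_switch` and `classify_sequential` are my
assumptions for illustration.

```cpp
#include <string_view>

// Table-based control flow: each value of `level` has exactly one
// entry point into the body (a duplicate case label is ill-formed),
// and the cases lexically share the flow that follows their label.
int cost_switch(int level) {
    int cost = 0;
    switch (level) {
    case 3:
        cost += 100;
        [[fallthrough]];
    case 2:
        cost += 10;
        [[fallthrough]];
    case 1:
        cost += 1;
        break;
    default:
        break;
    }
    return cost;
}

// Sequential matching: conditions are tried from top to bottom and
// the first one that holds wins. Overlapping conditions are accepted
// silently, so the second test below can never be reached.
std::string_view classify_sequential(int n) {
    if (n > 0)   return "positive";
    if (n > 100) return "large";  // shadowed by the test above
    return "non-positive";
}
```

Here `cost_switch(3)` enters at `case 3` and runs through
the shared tail, while `classify_sequential(200)` never
reports "large" because of the ordering.
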
> - a solution specifically for strings, making them first-class citizens and enabling better compiler optimizations?
That's something to be explored. What I know
is that, if we want to extend `switch` at all,
making string literals work out of the
box is the bare minimum, and it's okay if the
actual implementations targeting strings are
more "magical" than what the generalization
implies. But that information is more than
10 years old and certainly predates the
pattern matching discussions.
As of today, I guess C++20 structural types
may be worth considering.
> Is it clear, what codepage is string has? Is it clear, what equality means? Sometimes there are different Unicode ways to express the same letters or word, e.g. explicitly putting diacritics on letters or choosing a letter with diacritics.
>
> - a specific solution for strings with fixed number of options and perfect hashing and Tries?
>
> - an grammar for parsing included directly into the language?
>
> like in Raku:
>
> grammar Enhanced-Paragraph {
>     token TOP { <superword>[ (\s+) <superword>]+ }
>     token superword { <word> | <enhanced-word> }
>     token word { \w+ }
>     token enhanced-word { \* <word> \* }
> }
> my $paragraph = "þor is *mighty*";
> my $parsed = Enhanced-Paragraph.parse($paragraph);
> say $parsed;
>
> (copied from https://dev.to/jj/introduction-to-grammars-with-perl6-75e)
Received on 2025-05-23 23:53:29