Date: Tue, 08 Jul 2025 19:07:07 -0700
On Tuesday, 8 July 2025 17:25:48 Pacific Daylight Time JJ Marr via Std-
Proposals wrote:
> > Yeah, but... Why do you want it?
>
> I want the ability to do case-insensitive comparisons for the majority of
> human languages and letters without having to think too much about it, get
> approval to bring in an external library, or deal with
> implementation-defined behaviour.

But if we put you and everyone else who may use the Standard API together,
will the subset be any different from full Unicode? And why can't you just use
the de facto Unicode library, ICU?

The only reason I can think of (aside from company bureaucracy) is constexpr
support. An implementation using intrinsics can be much faster because the
compiler is probably already carrying such a database anyway, what with named
Unicode codepoints and other transformations it needs, and it won't need
interpreted constexpr execution.

And yet, why is this needed at constexpr time? We do not need to replace all
codegen tools. Write your own.

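
To make "write your own" concrete, here is a minimal sketch of what the output
of such a codegen tool might look like, assuming an offline script that reads
Unicode's CaseFolding.txt and emits a C++ table. The names FoldEntry,
kFoldTable, fold and equal_folded are hypothetical, and the table holds only
four simple one-to-one mappings for illustration; it is not a complete or
standard-conforming implementation.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical generator output: a few simple (one-to-one) case foldings
// taken from CaseFolding.txt. A real table would be far larger.
struct FoldEntry {
    char32_t from;
    char32_t to;
};

constexpr FoldEntry kFoldTable[] = {
    {0x00C9, 0x00E9}, // LATIN CAPITAL LETTER E WITH ACUTE
    {0x03A3, 0x03C3}, // GREEK CAPITAL LETTER SIGMA
    {0x03C2, 0x03C3}, // GREEK SMALL LETTER FINAL SIGMA
    {0x0416, 0x0436}, // CYRILLIC CAPITAL LETTER ZHE
};

// Simple case folding of one code point. ASCII is handled arithmetically,
// everything else by a linear scan of the (tiny) illustrative table.
constexpr char32_t fold(char32_t c) {
    if (c >= U'A' && c <= U'Z')
        return static_cast<char32_t>(c + (U'a' - U'A'));
    for (const FoldEntry& e : kFoldTable)
        if (e.from == c)
            return e.to;
    return c;
}

// Caseless equality under simple folding only. Full foldings that change
// the length (for example U+00DF to "ss") and normalization are not handled.
constexpr bool equal_folded(std::u32string_view a, std::u32string_view b) {
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (fold(a[i]) != fold(b[i]))
            return false;
    return true;
}

// Evaluated entirely at compile time.
static_assert(equal_folded(U"ICU", U"icu"));
```

Everything above is usable in constant expressions with no compiler
intrinsics. The cost is that the table is generated and shipped by the user
rather than provided by the implementation.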
> Part of the reason there's so many diverging implementations of "case
> insensitivity" is the lack of standardization. Here, the standard exists
> (Unicode). It is freely accessible, portable, and handles all of the edge
> cases we are discussing. We just need to implement it.
Agreed. And it's implemented in ICU. Ergo...

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel Platform & System Engineering
Received on 2025-07-09 02:07:14