ISOCPP std-proposals List: Re: [std-proposals] constexpr tolower, toupper, isalpha

From: Jason McKesson <jmckesson_at_[hidden]>
Date: Tue, 8 Jul 2025 23:33:26 -0400

On Tue, Jul 8, 2025 at 10:07 PM Thiago Macieira via Std-Proposals
<std-proposals_at_[hidden]> wrote:
> On Tuesday, 8 July 2025 17:25:48 Pacific Daylight Time JJ Marr via
> Std-Proposals wrote:
> > > Yeah, but... Why do you want it?
> >
> > I want the ability to do case-insensitive comparisons for the majority of
> > human languages and letters without having to think too much about it, get
> > approval to bring in an external library, or deal with
> > implementation-defined behaviour.
>
> But if we put you and everyone else who may use the Standard API together,
> will the subset be any different than full Unicode? And why can't you just use
> the de facto Unicode library, ICU?
>
> The only reason I can think of (aside from company bureaucracy) is constexpr
> support. An implementation using intrinsics can be much faster because the
> compiler is probably already carrying such a database anyway, what with named
> Unicode code points and other transformations it needs, and won't need to have
> interpreted constexpr execution.
>
> And yet, why is this needed at constexpr time? We do not need to replace all
> codegen tools. Write your own.
>
> > Part of the reason there are so many diverging implementations of "case
> > insensitivity" is the lack of standardization. Here, the standard exists
> > (Unicode). It is freely accessible, portable, and handles all of the edge
> > cases we are discussing. We just need to implement it.
>
> Agreed. And it's implemented in ICU. Ergo...

So what else from Unicode should C++ not support because ICU supports
it? Should we just not have UTF encoding conversions? No normalization
rules? How much of Unicode should C++ not support because ICU does?

C++ should support Unicode, part of that support should be case
folding, and Unicode case folding includes simple case folding. Since
simple case folding involves no locale messiness and no
language-specific rules, I don't see why C++ Unicode support would
exclude it.
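
To make "no locale, no language-specific rules" concrete, here is a minimal
sketch of what a constexpr simple case fold could look like. Everything in it
is an assumption for illustration: the names (FoldRange, FoldPair, simple_fold,
equal_folded) are hypothetical and not from any proposal, and the tables are a
small hand-picked excerpt of the status C and S entries in Unicode's
CaseFolding.txt, not the full data. It works on already-decoded code points
(char32_t) and assumes C++17.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical types: a contiguous run of code points that all fold by
// adding the same offset, and a one-off mapping.
struct FoldRange { char32_t first, last, delta; };
struct FoldPair  { char32_t from, to; };

// Excerpt only. A real implementation would generate these tables from
// the complete CaseFolding.txt (status C and S lines).
inline constexpr FoldRange fold_ranges[] = {
    {0x0041, 0x005A, 0x20},  // Basic Latin A-Z
    {0x00C0, 0x00D6, 0x20},  // Latin-1 capitals (first run)
    {0x00D8, 0x00DE, 0x20},  // Latin-1 capitals (second run)
    {0x0391, 0x03A1, 0x20},  // Greek capitals Alpha-Rho
    {0x03A3, 0x03AB, 0x20},  // Greek capitals Sigma onward
    {0x0410, 0x042F, 0x20},  // Cyrillic capitals A-Ya
};

inline constexpr FoldPair fold_pairs[] = {
    {0x00B5, 0x03BC},  // MICRO SIGN -> GREEK SMALL LETTER MU
    {0x017F, 0x0073},  // LATIN SMALL LETTER LONG S -> s
    {0x03C2, 0x03C3},  // GREEK SMALL LETTER FINAL SIGMA -> SIGMA
    {0x1E9E, 0x00DF},  // LATIN CAPITAL LETTER SHARP S -> sharp s (status S)
};

// One code point in, one code point out: no locale argument, no state.
constexpr char32_t simple_fold(char32_t c) {
    for (const auto& r : fold_ranges)
        if (c >= r.first && c <= r.last)
            return static_cast<char32_t>(c + r.delta);
    for (const auto& p : fold_pairs)
        if (c == p.from)
            return p.to;
    return c;  // code points without a C/S mapping fold to themselves
}

// Caseless comparison of two code point sequences under simple folding.
constexpr bool equal_folded(std::u32string_view a, std::u32string_view b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (simple_fold(a[i]) != simple_fold(b[i])) return false;
    return true;
}

// Usable at compile time.
static_assert(simple_fold(U'A') == U'a');
static_assert(equal_folded(U"\u03A3\u03C2", U"\u03C3\u03C3"));
```

Because simple folding maps each code point to exactly one code point, the
comparison never changes string length. The trade-off is that multi-character
full foldings (status F, such as sharp s to "ss") and the Turkic mappings
(status T) are deliberately out of scope for this sketch.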

Received on 2025-07-09 03:33:39