Date: Mon, 9 Mar 2026 13:57:23 -0400
On Mon, Mar 9, 2026 at 5:23 AM Jan Schultke via Std-Proposals <
std-proposals_at_[hidden]> wrote:
>> are you only talking about A-Z/a-z here? Or Spanish ene, German umlauts,
>> accented letters, etc.?
>>
> I'm only talking about ASCII letters. Umlauts are not in the basic
> character set anyway, so we can't provide any universal guarantee for those.
>
> For ASCII letters (with the same case), we can guarantee ordering. I think
> we can even guarantee that lowercase and uppercase letters are not
> interleaved, but in two disjoint ranges, and even guarantee that there is a
> constant offset between lowercase and uppercase letters. The only thing we
> cannot guarantee is full contiguity.
>
These days, one *could* just guarantee ASCII and be done with it.
As I read between your lines, what you're actually trying to do here is
accept *either* ASCII *or* EBCDIC (but don't go out of your way to accept
anything else).
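To make the quoted rules concrete, here is a rough sketch of how one might check them at compile time against whatever ordinary literal encoding the compiler uses. The helper names (uc, strictly_increasing, cases_disjoint, constant_case_offset, contiguous) are made up for illustration and do not come from any proposal; it assumes C++14 or later for the loops in constexpr functions.

```cpp
// Hypothetical helpers, for illustration only.
constexpr char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
constexpr char lower[] = "abcdefghijklmnopqrstuvwxyz";

// Compare code unit values as unsigned, so the result does not depend on
// whether plain char is signed.
constexpr int uc(char c) { return static_cast<unsigned char>(c); }

// Rule 1: letters of the same case are in strictly increasing order.
constexpr bool strictly_increasing(const char* s) {
    for (int i = 1; i < 26; ++i)
        if (uc(s[i - 1]) >= uc(s[i])) return false;
    return true;
}

// Rule 2: the two cases occupy disjoint, non-interleaved ranges.
// (Relies on rule 1, so that [A, Z] and [a, z] really are the ranges.)
constexpr bool cases_disjoint() {
    return uc(upper[25]) < uc(lower[0]) || uc(lower[25]) < uc(upper[0]);
}

// Rule 3: the offset between a lowercase letter and its uppercase
// counterpart is the same for every letter.
constexpr bool constant_case_offset() {
    for (int i = 1; i < 26; ++i)
        if (uc(lower[i]) - uc(upper[i]) != uc(lower[0]) - uc(upper[0]))
            return false;
    return true;
}

// Deliberately NOT asserted below: full contiguity holds for ASCII but
// fails for EBCDIC, which has gaps after I and R.
constexpr bool contiguous(const char* s) {
    for (int i = 1; i < 26; ++i)
        if (uc(s[i]) - uc(s[i - 1]) != 1) return false;
    return true;
}

static_assert(strictly_increasing(upper) && strictly_increasing(lower), "");
static_assert(cases_disjoint(), "");
static_assert(constant_case_offset(), "");
```

On an ASCII-based implementation all three assertions pass and contiguous(upper) is also true. On an EBCDIC-based one I would expect the three assertions to pass as well, with only contiguous() returning false.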
FWIW, I just checked, and your rules would even successfully accept Knuth's
original MIX character encoding
<https://archive.org/details/fundamentalalgor0001unse/page/136/mode/2up>,
which assigns A=1, B=2, etc. according to this function:
char chr(int i) { return " ABCDEFGHIΘJKLMNOPQRΦ∏STUVWXYZ0123456789.,()+-*/=$<>@;:'"[i]; }
(But that encoding has no lowercase, so it's not really relevant to C++
anyway.)
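For anyone who wants to repeat that check, a minimal sketch follows. It stores the table as a char32_t string so that each character is one array element (a plain narrow string would split the Greek letters into several bytes), and it writes the three Greek letters as universal character names, assuming capital theta, phi and pi are the intended characters. The names mix_table, mix_code, mix_letters_ordered and mix_letters_contiguous are hypothetical.

```cpp
// Code 0 is the space; the three \u escapes are the assumed Greek
// letters theta, phi and pi.
constexpr char32_t mix_table[] =
    U" ABCDEFGHI\u0398JKLMNOPQR\u03A6\u03A0STUVWXYZ0123456789.,()+-*/=$<>@;:'";

// Inverse of the table: the code assigned to a character, or -1 if the
// character does not exist in the encoding.
constexpr int mix_code(char32_t c) {
    for (int i = 0; mix_table[i] != U'\0'; ++i)
        if (mix_table[i] == c) return i;
    return -1;
}

// U'A'..U'Z' are Unicode code points, so this loop visits the letters in
// alphabetical order regardless of the execution character set.
constexpr bool mix_letters_ordered() {
    for (char32_t c = U'B'; c <= U'Z'; ++c)
        if (mix_code(c - 1) >= mix_code(c)) return false;
    return true;
}

constexpr bool mix_letters_contiguous() {
    for (char32_t c = U'B'; c <= U'Z'; ++c)
        if (mix_code(c) - mix_code(c - 1) != 1) return false;
    return true;
}

static_assert(mix_code(U'A') == 1 && mix_code(U'B') == 2, "");
static_assert(mix_letters_ordered(), "");       // ordering holds
static_assert(!mix_letters_contiguous(), "");   // gaps after I and R
static_assert(mix_code(U'a') == -1, "");        // no lowercase at all
```

The ordering rule holds and contiguity does not, much as with EBCDIC; the rules about lowercase are vacuous here because the table has none.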
–Arthur
Received on 2026-03-09 17:57:37
