
[SG16] Is the concept of basic execution character sets useful?

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 27 Jan 2021 09:56:46 +0100
Hello,

A very quick reminder, using C++20 terminology.
We have:

- The basic source character set, which, while of limited use in the core
language, is used quite a bit in the library as a proxy for "displayable
characters available in all encodings"; removing it would therefore be
slightly more involved.

- The execution character set(s), which describe the actual character sets
used during evaluation and are therefore necessary.

- The basic execution character set, which is a superset of the basic
source character set and a subset of all execution character sets.

It is exactly the basic source character set + alert + backspace + carriage
return + the null character.

It is not used anywhere in the library.
It is not used in the core language either, except of course that we need
to prescribe that the null character is encoded as 0 and that digits are
encoded sequentially.
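
To make those two remaining guarantees concrete, here is a minimal sketch
of what user code relies on. The helper names (digits_are_sequential,
digit_value, self_check) are hypothetical and only for illustration; the
sketch assumes a C++14 or later compiler.

```cpp
#include <cassert>

// Checks that the ten decimal digit characters have consecutive values,
// i.e. that each digit after '0' is one greater than the previous one.
constexpr bool digits_are_sequential() {
    const char digits[] = "0123456789";
    for (int i = 1; i < 10; ++i) {
        if (digits[i] != digits[i - 1] + 1) {
            return false;
        }
    }
    return true;
}

// The two properties the core language still needs to prescribe.
static_assert('\0' == 0, "the null character has value 0");
static_assert(digits_are_sequential(), "digits are encoded sequentially");

// Typical code that depends on the sequential-digit guarantee.
// Precondition (assumed, not checked): c is one of '0'..'9'.
constexpr int digit_value(char c) {
    return c - '0';
}

// Small runtime self-check of the helper above.
inline void self_check() {
    assert(digit_value('0') == 0);
    assert(digit_value('9') == 9);
}
```

Neither check depends on the term "basic execution character set"
itself, only on the two value requirements.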

While alert, backspace, and carriage return are mentioned in escape
sequences, if a hypothetical encoding lacked these characters, there
would be no further ill effect on the behavior of the standard.

The main change on top of the C++20 wording would be as follows:

The basic execution character set and the basic execution wide-character
set shall each contain all the members of the basic source character set, plus
control characters representing alert, backspace, and carriage return, plus
a null character (respectively, null wide character), whose value is 0. For
each basic execution character set, the values of the members shall be
non-negative and distinct from one another. In both the source and
execution basic character sets, the value of each character after 0 in the
above list of decimal digits shall be one greater than the value of the
previous. The execution character set and the execution wide-character
set are implementation-defined supersets of the basic execution character
set and the basic execution wide-character set, respectively. The values of
the members of the execution character sets and the sets of additional
members are locale-specific.
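
As an illustration only, the "non-negative and distinct" requirement in
the C++20 wording above can be checked at compile time for a given
implementation. This is a sketch, not proposed wording: the names
members, all_non_negative and all_distinct are hypothetical, and it
assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <cstddef>

// The 96 members of the C++20 basic source character set, followed by
// alert, backspace and carriage return. The terminating null of the
// literal stands in for the null character, giving 100 elements.
constexpr char members[] =
    " \t\v\f\n"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"
    "\a\b\r";

// True if no member has a negative value as a char.
constexpr bool all_non_negative() {
    for (std::size_t i = 0; i < sizeof(members); ++i) {
        if (members[i] < 0) {
            return false;
        }
    }
    return true;
}

// True if no two members share the same value.
constexpr bool all_distinct() {
    for (std::size_t i = 0; i < sizeof(members); ++i) {
        for (std::size_t j = i + 1; j < sizeof(members); ++j) {
            if (members[i] == members[j]) {
                return false;
            }
        }
    }
    return true;
}

static_assert(sizeof(members) == 100, "96 + 3 control characters + null");
static_assert(all_non_negative(), "member values are non-negative");
static_assert(all_distinct(), "member values are distinct");
```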

Any reason why we should not do this?

(As always, I'm interested in having a simple model with no unnecessary
terminology since, as observed these past few months, such terminology
tends to hinder our collective understanding.)

Corentin

Received on 2021-01-27 02:57:00