Re: [SG16] Is the concept of basic execution character sets useful?

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 27 Jan 2021 10:04:22 +0100
On Wed, Jan 27, 2021 at 9:59 AM Peter Brett <pbrett_at_[hidden]> wrote:

> Hi Corentin,
>
>
>
> This certainly seems like a possible simplification to me. Out of
> interest, did you manage to find out **why** the concept of the basic
> execution character set was added to the standard in the first place?
>

Alas, I didn't, but it goes back to at least C89.


>
>
> Peter
>
>
>
> *From:* SG16 <sg16-bounces_at_[hidden]> *On Behalf Of *Corentin via
> SG16
> *Sent:* 27 January 2021 08:57
> *To:* SG16 <sg16_at_[hidden]>
> *Cc:* Corentin <corentin.jabot_at_[hidden]>
> *Subject:* [SG16] Is the concept of basic execution character sets useful?
>
>
>
> Hello,
>
>
>
> A very quick reminder, using C++20 terminology.
>
> We have:
>
>
>
> - The basic source character set, which, while of limited use in the core
> language, is used quite a bit in the library as a proxy for "displayable
> characters available in all encodings"; removing it would therefore be
> slightly more involved.
>
>
>
> - The execution character set(s) which describe actual character sets used
> during evaluation and are therefore necessary.
>
>
>
> - The basic execution character set, which is a superset of the basic
> source character set and a subset of all execution character sets.
>
>
>
> It is exactly the basic source character set + alert + backspace +
> carriage return + the null character.
>
>
>
> Nowhere is it used in the library.
>
> It is not used in the core language either, except of course that we need
> to prescribe that the null character is encoded as 0 and that digits are
> encoded sequentially.
>
>
>
> While alert, backspace, and carriage return are mentioned in escape
> sequences, if a hypothetical encoding lacked these characters, there
> would be no further ill effect on the behavior of the standard.
>
>
>
> The main change on top of the C++20 wording would be as follows:
>
>
>
> The basic execution character set and the basic execution wide-character
> set shall each contain all the members of the basic source character set, plus
> control characters representing alert, backspace, and carriage return, plus
> a null character (respectively, null wide character), whose value is 0. For
> each basic execution character set, the values of the members shall be
> non-negative and distinct from one another. In both the source and
> execution basic character sets, the value of each character after 0 in the
> above list of decimal digits shall be one greater than the value of the
> previous. The execution character set and the execution wide-character
> set are implementation-defined supersets of the basic execution character
> set and the basic execution wide-character set, respectively. The values
> of the members of the execution character sets and the sets of additional
> members are locale-specific.
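
As an illustration only, the guarantees spelled out in the wording quoted above (a null character with value 0, consecutive decimal digits, distinct non-negative values) can be written down as compile-time checks. This is a minimal sketch, not part of the proposal: `digit_value` is a hypothetical helper introduced here, and the checks only sample a few members of the set rather than verifying all of them.

```cpp
// Sketch of the guarantees discussed in the wording, as compile-time checks.
// Assumption: any conforming implementation should accept these assertions.

// The null character (narrow and wide) has the value 0.
static_assert('\0' == 0, "null character must have value 0");
static_assert(L'\0' == 0, "null wide character must have value 0");

// Each decimal digit after '0' is one greater than the previous one,
// so the distance from '0' to '9' is exactly 9.
static_assert('1' - '0' == 1, "digits must be consecutive");
static_assert('9' - '0' == 9, "digits must be consecutive");
static_assert(L'9' - L'0' == 9, "wide digits must be consecutive");

// Alert, backspace, and carriage return: non-negative and distinct.
static_assert('\a' >= 0 && '\b' >= 0 && '\r' >= 0,
              "control characters must be non-negative");
static_assert('\a' != '\b' && '\b' != '\r' && '\a' != '\r',
              "control characters must be distinct");

// Hypothetical helper (not from the standard or this thread): converts a
// decimal digit character to its numeric value, or returns -1 otherwise.
// It relies only on the "digits are encoded sequentially" guarantee.
constexpr int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

static_assert(digit_value('7') == 7, "digit conversion relies on contiguity");
```

The digit and null-character checks are the ones user code commonly depends on (for example, in hand-written number parsing), which is why those two requirements would need to survive any simplification of the terminology.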
>
>
>
> Any reason why we should not do this?
>
>
>
> (As always, I'm interested in having a simple model with no unnecessary
> terminology because, as we have observed over the past few months, such
> terminology tends to hinder our collective understanding.)
>
>
>
> Corentin

Received on 2021-01-27 03:04:37