SG16

Subject: Re: Is the concept of basic execution character sets useful?
From: Steve Downey (sdowney_at_[hidden])
Date: 2021-01-27 10:50:21


The basic execution character set is the basic source character set plus
the mandatory control codes. It is what you need to express the mandatory
C characters in the execution space, without discussing what the encoding
actually is. When there was much wider variance in which characters each
character set included, it mattered far more when figuring out how to
port the language.
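
As a rough illustration of those guarantees (a sketch only, assuming C++14
or later; the helper names below are made up for this example and do not
come from the standard), the properties under discussion can be checked at
compile time without knowing the actual encoding:

```cpp
#include <cassert>

// Hypothetical helper: value of a decimal digit character. It relies on
// the guarantee that '0'..'9' have consecutive values.
constexpr int digit_value(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';
}

// Hypothetical helper: each digit after '0' is exactly one greater than
// the previous one.
constexpr bool digits_are_contiguous() {
    const char digits[] = "0123456789";
    for (int i = 1; i < 10; ++i) {
        if (digits[i] != digits[i - 1] + 1) return false;
    }
    return true;
}

// Hypothetical helper: the characters added on top of the basic source
// character set (alert, backspace, carriage return, null) are
// non-negative and distinct from one another.
constexpr bool controls_are_distinct() {
    const char controls[] = {'\a', '\b', '\r', '\0'};
    for (int i = 0; i < 4; ++i) {
        if (controls[i] < 0) return false;
        for (int j = i + 1; j < 4; ++j) {
            if (controls[i] == controls[j]) return false;
        }
    }
    return true;
}

static_assert('\0' == 0, "the null character has value 0");
static_assert(digits_are_contiguous(), "digits are encoded sequentially");
static_assert(controls_are_distinct(), "control characters are distinct");
static_assert(digit_value('7') == 7, "digit arithmetic is portable");
```

These checks should pass on both ASCII-based and EBCDIC-based
implementations, because they only rely on the relationships the standard
prescribes and not on specific code values.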

On Wed, Jan 27, 2021 at 4:04 AM Corentin via SG16 <sg16_at_[hidden]>
wrote:

>
>
> On Wed, Jan 27, 2021 at 9:59 AM Peter Brett <pbrett_at_[hidden]> wrote:
>
>> Hi Corentin,
>>
>>
>>
>> This certainly seems like a possible simplification to me. Out of
>> interest, did you manage to find out **why** the concept of the basic
>> execution character set was added to the standard in the first place?
>>
>
> Alas I didn't, but it goes back to at least C89
>
>
>>
>>
>> Peter
>>
>>
>>
>> *From:* SG16 <sg16-bounces_at_[hidden]> *On Behalf Of *Corentin via
>> SG16
>> *Sent:* 27 January 2021 08:57
>> *To:* SG16 <sg16_at_[hidden]>
>> *Cc:* Corentin <corentin.jabot_at_[hidden]>
>> *Subject:* [SG16] Is the concept of basic execution character sets
>> useful?
>>
>>
>>
>> Hello,
>>
>>
>>
>> Very quick reminder, using C++20 terminology
>>
>> We have:
>>
>>
>>
>> - The basic source character set, which, while of limited use in the core
>> language, is used quite a bit in the library as a proxy for "displayable
>> characters available in all encodings", so its removal would be
>> slightly more involved.
>>
>>
>>
>> - The execution character set(s) which describe actual character sets
>> used during evaluation and are therefore necessary.
>>
>>
>>
>> - The basic execution character set, which is a superset of the basic
>> source character set and a subset of all execution character sets.
>>
>>
>>
>> It's strictly the basic source character set + alert + backspace +
>> carriage return + the null character.
>>
>>
>>
>> Nowhere is it used in the library.
>>
>> It is not used in the core language either, except of course that we need
>> to prescribe that the null character is encoded as 0 and that digits are
>> encoded sequentially.
>>
>>
>>
>> While alert, backspace and carriage return are mentioned in escape
>> sequences, if a hypothetical encoding lacked these characters, there
>> would be no further ill effect on the behavior of the standard.
>>
>>
>>
>> The main change on top of the C++20 wording would be as follows:
>>
>>
>>
>> The basic execution character set and the basic execution wide-character
>> set shall each contain all the members of the basic source character set, plus
>> control characters representing alert, backspace, and carriage return, plus
>> a null character (respectively, null wide character), whose value is 0. For
>> each basic execution character set, the values of the members shall be
>> non-negative and distinct from one another. In both the source and
>> execution basic character sets, the value of each character after 0 in the
>> above list of decimal digits shall be one greater than the value of the
>> previous. The execution character set and the execution wide-character
>> set are implementation-defined supersets of the basic execution character
>> set and the basic execution wide-character set, respectively. The values
>> of the members of the execution character sets and the sets of additional
>> members are locale-specific.
>>
>>
>>
>> Any reason why we should not do this?
>>
>>
>>
>> (As always, I'm interested in having a simple model with no
>> unnecessary terminology since, as observed these past few months, such
>> terminology has a tendency to hinder our collective understanding.)
>>
>>
>>
>> Corentin
>>


