Re: [SG16] Is the concept of basic execution character sets useful?

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 27 Jan 2021 17:53:58 +0100
On Wed, Jan 27, 2021 at 5:50 PM Steve Downey <sdowney_at_[hidden]> wrote:

> The basic execution character set is the basic source character set
> plus the mandatory control codes, and is what you need to express the
> mandatory C characters in the execution space, without discussing what the
> encoding actually is. When there was much wider variance in what character
> sets included which characters it was far more important in figuring out
> how to port the language.
>

If there were a theoretical encoding that had no bell, for example, would
that break C++ in any way?
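
To make the question concrete: a program only observes the alert character through the `\a` escape in a literal, and the only properties it can rely on are that the value is non-negative and distinct from the other members of the set. The following is a minimal sketch of a compile-time check of those properties, assuming a C++14 or later compiler. The names `kControls`, `distinct_and_nonnegative` and `check_controls` are made up for illustration and do not come from the standard.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The three control characters that the basic execution character set adds
// to the basic source character set (the fourth addition is the null character).
constexpr std::array<char, 3> kControls{'\a', '\b', '\r'};

// Hypothetical helper: true if every value is non-negative and no two are equal.
constexpr bool distinct_and_nonnegative(const std::array<char, 3>& cs) {
    for (std::size_t i = 0; i < cs.size(); ++i) {
        if (cs[i] < 0) return false;
        for (std::size_t j = i + 1; j < cs.size(); ++j) {
            if (cs[i] == cs[j]) return false;
        }
    }
    return true;
}

// The null character is required to have the value 0.
static_assert('\0' == 0, "null character must be encoded as 0");
// The control characters must be distinct, non-negative values.
static_assert(distinct_and_nonnegative(kControls),
              "alert, backspace and carriage return must be distinct");

// Runtime variant of the same check, callable from a test or from main().
inline void check_controls() {
    assert(distinct_and_nonnegative(kControls));
}
```

The actual numeric values are deliberately not checked, since they depend on the execution encoding.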


>
> On Wed, Jan 27, 2021 at 4:04 AM Corentin via SG16 <sg16_at_[hidden]>
> wrote:
>
>>
>>
>> On Wed, Jan 27, 2021 at 9:59 AM Peter Brett <pbrett_at_[hidden]> wrote:
>>
>>> Hi Corentin,
>>>
>>>
>>>
>>> This certainly seems like a possible simplification to me. Out of
>>> interest, did you manage to find out **why** the concept of the basic
>>> execution character set was added to the standard in the first place?
>>>
>>
>> Alas, I didn't, but it goes back to at least C89.
>>
>>
>>>
>>>
>>> Peter
>>>
>>>
>>>
>>> *From:* SG16 <sg16-bounces_at_[hidden]> *On Behalf Of *Corentin
>>> via SG16
>>> *Sent:* 27 January 2021 08:57
>>> *To:* SG16 <sg16_at_[hidden]>
>>> *Cc:* Corentin <corentin.jabot_at_[hidden]>
>>> *Subject:* [SG16] Is the concept of basic execution character sets
>>> useful?
>>>
>>>
>>>
>>> Hello,
>>>
>>>
>>>
>>> Very quick reminder, using C++20 terminology
>>>
>>> We have:
>>>
>>>
>>>
>>> - The basic source character set, which, while of limited use in the core
>>> language, is used quite a bit in the library as a proxy for "displayable
>>> characters available in all encodings", and whose removal would
>>> therefore be slightly more involved.
>>>
>>>
>>>
>>> - The execution character set(s) which describe actual character sets
>>> used during evaluation and are therefore necessary.
>>>
>>>
>>>
>>> - The basic execution character set, which is a superset of the basic
>>> source character set and a subset of all execution character sets.
>>>
>>>
>>>
>>> It is exactly the basic source character set + alert + backspace +
>>> carriage return + the null character.
>>>
>>>
>>>
>>> Nowhere is it used in the library.
>>>
>>> It is not used in the core language either, except of course that we
>>> need to prescribe that the null character is encoded as 0 and that
>>> digits are encoded sequentially.
>>>
>>>
>>>
>>> While alert, backspace, and carriage return are mentioned in escape
>>> sequences, if a theoretical encoding lacked these characters, there
>>> would be no further ill effect on the behavior of the standard.
>>>
>>>
>>>
>>> The main change on top of the C++20 wording would be as follows:
>>>
>>>
>>>
>>> The basic execution character set and the basic execution
>>> wide-character set shall each contain all the members of the basic source
>>> character set, plus control characters representing alert, backspace,
>>> and carriage return, plus a null character (respectively, null wide
>>> character), whose value is 0. For each basic execution character set,
>>> the values of the members shall be non-negative and distinct from one
>>> another. In both the source and execution basic character sets, the value
>>> of each character after 0 in the above list of decimal digits shall be one
>>> greater than the value of the previous. The execution character set and
>>> the execution wide-character set are implementation-defined supersets of
>>> the basic execution character set and the basic execution wide-character
>>> set, respectively. The values of the members of the execution character
>>> sets and the sets of additional members are locale-specific.
>>>
>>>
>>>
>>> Any reason why we should not do this?
>>>
>>>
>>>
>>> (As always, I'm interested in having a simple model with no
>>> unnecessary terminology since, as observed these past few months, such
>>> terminology tends to hinder our collective understanding.)
>>>
>>>
>>>
>>> Corentin
>>>
>

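For reference, the sets described in the quoted message can be written out directly. What follows is only a sketch, assuming C++17 and the C++20 definition of the basic source character set (96 members). The names `kBasicSource`, `is_basic_source`, `is_basic_execution` and `digit_value` are hypothetical. It also shows the two guarantees the core language actually relies on: the null character is 0 and the decimal digits are contiguous.

```cpp
#include <cassert>
#include <string_view>

// The 96 members of the C++20 basic source character set: 91 graphical
// characters plus space, horizontal tab, vertical tab, form feed and new-line.
constexpr std::string_view kBasicSource =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"
    " \t\v\f\n";
static_assert(kBasicSource.size() == 96);

// Hypothetical helper: is c a member of the basic source character set?
constexpr bool is_basic_source(char c) {
    return kBasicSource.find(c) != std::string_view::npos;
}

// Per the quoted wording, the basic execution character set adds exactly
// four members: alert, backspace, carriage return and the null character.
constexpr bool is_basic_execution(char c) {
    return is_basic_source(c) || c == '\a' || c == '\b' || c == '\r' || c == '\0';
}

// Digits are required to be contiguous and increasing, so c - '0' is
// portable for any decimal digit regardless of the execution encoding.
constexpr int digit_value(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';
}

static_assert('\0' == 0);
static_assert('9' - '0' == 9);
static_assert(digit_value('7') == 7);
static_assert(is_basic_execution('\a') && !is_basic_source('\a'));
```

Nothing in the sketch depends on the set having its own name: the checks only use the basic source character set plus the four extra characters.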
Received on 2021-01-27 10:54:11