sg16: Re: [SG16] Agreeing with Corentin's point re: problem with strict use of abstract characters

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Sun, 14 Jun 2020 17:12:22 -0400

On Sun, Jun 14, 2020 at 2:48 PM Corentin Jabot <corentinjabot_at_[hidden]>
wrote:

>
> On Sun, 14 Jun 2020 at 20:03, Hubert Tong <
> hubert.reinterpretcast_at_[hidden]> wrote:
>
>> On Sun, Jun 14, 2020 at 5:03 AM Corentin Jabot <corentinjabot_at_[hidden]>
>> wrote:
>>
>>>
>>> On Sun, 14 Jun 2020 at 08:59, Jens Maurer via SG16 <
>>> sg16_at_[hidden]> wrote:
>>>
>>>> I don't think we should entertain any notion of "same character" in C++,
>>>> beyond value comparisons in the execution encoding and "identity" as
>>>> needed for "same identifier".
>>>>
>>>
>>> We need to in/before phase 1, but I think we reached the consensus that
>>> we otherwise shouldn't and wouldn't.
>>>
>> To be clear, we need to make sure we are on the same page with respect to
>> the meta-notion (the notion of the notion) of "same character":
>> By "character", do we mean an "abstract character" or a "coded character"?
>>
>
> abstract character in phase 1 (to get rid of "abstract character" in
> phase 1, we would have to assume that we have encoded text already; I
> think that would be a reasonable assumption)
>
I think our definition of the members of the "basic source character set"
would still be in terms of abstract characters. The input, I believe, needs
to be considered encoded text in order to encapsulate all of the perceived
relevant differences between characters.

>
>
>>
>> I think that the relationships between terms represent an ideal that is
>> not met in practice. "Abstract character" is a meaningful notion; however,
>> the ideal that coded character sets are a bijective function between values
>> in a codespace and abstract characters has not been clearly attained.
>>
>
> Coded character sets encode a set of abstract characters (Unicode has
> non-characters).
>
I believe the U+00C5/U+212B situation points out why we have a problem when
trying to handle abstract characters. At the lower technical level, the
coded character set has them as different abstract characters. At a higher
level, they are considered the same. If we deal in coded characters, we
would not need to handle the "philosophical questions".
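
As a rough illustration of the two levels (a sketch only, using Python's
standard unicodedata module rather than anything a C++ implementation is
required to do; the variable names are made up for the example):

```python
import unicodedata

a_ring = "\u00C5"    # LATIN CAPITAL LETTER A WITH RING ABOVE
angstrom = "\u212B"  # ANGSTROM SIGN

# Lower level: as coded characters, these are two distinct code points.
same_coded = a_ring == angstrom

# Higher level: the two are canonically equivalent, so they become
# identical once both are put into Normalization Form C.
same_after_nfc = (
    unicodedata.normalize("NFC", a_ring)
    == unicodedata.normalize("NFC", angstrom)
)

print(same_coded, same_after_nfc)  # prints: False True
```

Comparing coded characters gives a single mechanical answer; whether the
second comparison is the "right" notion of sameness is exactly the kind of
question we would avoid by not dealing in abstract characters.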

>
> Some abstract characters do not exist in any coded character set. There
> are abstract characters not yet represented in computers that cannot be
> handled by a C++ implementation
>
With a "wetware" implementation, the formality of defining a coded
character set is not a requirement for the coded character set to "exist".

Received on 2020-06-14 16:15:49