sg16: Re: [SG16] P2194R0 The character set of C++ source code is Unicode

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 9 Sep 2020 09:37:12 -0400

On 9/8/20 10:09 AM, Peter Brett via SG16 wrote:
> Hi all,
>
> If you have already formulated any comments or suggestions with regard to this
> paper, I'd really appreciate it if you could share them now so that I have the
> chance to think about them in advance of tomorrow's SG16 meeting.

If there is consensus to move in this direction, then we should discuss
wording strategy. As is, the proposed wording creates a hole in the
standard by removing the formation of implicit
/universal-character-name/s. Grammar terms like /identifier-nondigit/ will
need updates to recognize characters outside the basic source character
set that are not spelled as /universal-character-name/s.
/identifier-nondigit/ is currently defined as:

    identifier-nondigit:
         nondigit
         universal-character-name

    nondigit: one of
         a b c d e f g h i j k l m
         n o p q r s t u v w x y z
         A B C D E F G H I J K L M
         N O P Q R S T U V W X Y Z _

Similar updates will be needed for /c-char/ and /s-char/, and possibly
for /r-char/, /h-char/, and /q-char/. An audit of
/universal-character-name/ would likely reveal where additional updates
would be needed.
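
To make the hole concrete, here is a minimal C++ sketch, not proposed
wording. All function names are hypothetical, and treating every non-ASCII
scalar value as acceptable is a simplifying assumption; real wording would
presumably restrict the allowed code points. The first matcher follows the
grammar as quoted above; the second adds one possible extra alternative for
characters that appear directly in the source.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: the characters listed under /nondigit/.
constexpr bool is_nondigit(char32_t c) {
    return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') || c == U'_';
}

constexpr bool is_hex_digit(char32_t c) {
    return (c >= U'0' && c <= U'9') || (c >= U'a' && c <= U'f') ||
           (c >= U'A' && c <= U'F');
}

// Assumption for illustration only: any non-ASCII Unicode scalar value
// counts as a character outside the basic source character set.
constexpr bool is_extended_char(char32_t c) {
    return c > 0x7F && c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

// Length of a spelled-out \uXXXX or \UXXXXXXXX at s[pos], or 0 if none.
constexpr std::size_t ucn_length(std::u32string_view s, std::size_t pos) {
    if (pos + 1 >= s.size() || s[pos] != U'\\') return 0;
    const std::size_t digits = s[pos + 1] == U'u' ? 4
                             : s[pos + 1] == U'U' ? 8 : 0;
    if (digits == 0 || pos + 2 + digits > s.size()) return 0;
    for (std::size_t i = 0; i < digits; ++i)
        if (!is_hex_digit(s[pos + 2 + i])) return 0;
    return 2 + digits;
}

// identifier-nondigit as currently defined:
//   nondigit | universal-character-name
// Returns the number of code points matched, or 0 for no match.
constexpr std::size_t match_current(std::u32string_view s, std::size_t pos) {
    if (pos < s.size() && is_nondigit(s[pos])) return 1;
    return ucn_length(s, pos);
}

// One possible update: also accept an extended character written directly.
constexpr std::size_t match_updated(std::u32string_view s, std::size_t pos) {
    if (const std::size_t n = match_current(s, pos)) return n;
    return (pos < s.size() && is_extended_char(s[pos])) ? 1 : 0;
}

// U+00E9 written directly is rejected by the current grammar once implicit
// universal-character-name formation is gone, but its spelled-out form is not.
static_assert(match_current(U"\u00e9", 0) == 0, "raw U+00E9 falls in the hole");
static_assert(match_current(U"\\u00e9", 0) == 6, "spelled-out UCN still matches");
static_assert(match_updated(U"\u00e9", 0) == 1, "extra alternative closes it");
```

The same shape of change would apply to each of the other productions
mentioned above.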

Tom.

>
> If you plan to take part in the meeting tomorrow then please take the time to
> read the paper.
>
> Best regards,
>
> Peter
>
>> -----Original Message-----
>> From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Peter Brett via SG16
>> Sent: 24 August 2020 13:31
>> To: sg16_at_[hidden]
>> Cc: Peter Brett <pbrett_at_[hidden]>; Corentin <corentin.jabot_at_[hidden]>
>> Subject: [SG16] P2194R0 The character set of C++ source code is Unicode
>>
>> Hi all,
>>
>> In this week's meeting, we are going to discuss the remaining
>> proposals from P2178R1 "Misc lexing and string handling improvements".
>> In particular, we will discuss proposal 9:
>>
>> Proposal 9: Reaffirming Unicode as the character set of the
>> internal representation
>>
>> In anticipation of a lively discussion, Corentin and I have written a
>> short new paper which will be appearing in the September mailing.
>>
>> P2194R0 The character set of C++ source code is Unicode
>>
>> https://isocpp.org/files/papers/P2194R0.pdf
>>
>> We hope that the study group finds this contribution helpful and
>> informative.
>>
>> Best regards,
>>
>> Peter

Received on 2020-09-09 08:40:43