Re: Pattern Syntax and Whitespace

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Wed, 14 Sep 2022 23:47:56 -0400
On Wed, Sep 14, 2022 at 10:48 PM Steve Downey <sdowney_at_[hidden]> wrote:

>
>
> On Wed, Sep 14, 2022, 22:18 Hubert Tong <hubert.reinterpretcast_at_[hidden]>
> wrote:
>
>> On Wed, Sep 14, 2022 at 9:19 PM Steve Downey <sdowney_at_[hidden]> wrote:
>>
>>> We said it doesn't apply for the wrong reasons. It still doesn't apply,
>>> but we should explain why.
>>>
>>
>> You mean the intent is to state that we don't meet UAX-R3 at all?
>> If the intent is instead to state that we meet UAX-R3-2, then I am not
>> sure that there is enough clarity over what is meant by "syntactic use".
>>
>
> I think we might eventually harmonize them.
> My intent right now is to disclaim conformance, as we already do, and
> describe what we do, with references back to normative text.
>
> If describing what we use for syntax is controversial, in the sense that
> UAX31 describes, or we don't agree on what it means, saying we're not
> conforming and see relevant parts of [lex] without detail would be entirely
> acceptable to me.
>
> We've done the right thing, for the wrong reason, and I'd like to replace
> the reason, without claiming anything more.
>

Sounds good.


>
>
>
>>
>>>
>>> On Wed, Sep 14, 2022, 21:17 Steve Downey <sdowney_at_[hidden]> wrote:
>>>
>>>> Hopefully I'll be clearer in the draft. Our whitespace definition is
>>>> entirely disjoint from identifier characters, but a subset of the Unicode
>>>> whitespace characters. Our syntax characters are limited to the basic
>>>> source characters, and are also a subset of what Unicode allows.
>>>> My intention is to document what we do, using Unicode notation, but
>>>> make no actual changes to lexing or parsing.
>>>>
>>>> In some future standard, extending whitespace ought to be reasonably
>>>> straightforward.
>>>> Extending the set we use for syntax would be fraught.
>>>>
>>>> On Wed, Sep 14, 2022, 21:05 Hubert Tong <
>>>> hubert.reinterpretcast_at_[hidden]> wrote:
>>>>
>>>>> On Wed, Sep 14, 2022 at 7:11 PM Steve Downey via SG16 <
>>>>> sg16_at_[hidden]> wrote:
>>>>>
>>>>>> The new Unicode 15.0 version of UAX 31 clarifies
>>>>>> https://unicode.org/reports/tr31/#Pattern_Syntax that the
>>>>>> definitions of whitespace and characters used for syntax are intended to
>>>>>> apply to programming languages. C++ does not use them, of course, nor am I
>>>>>> suggesting we should make this change now.
>>>>>>
>>>>>> For purposes of the conformance annex, E,
>>>>>> http://eel.is/c++draft/uaxid#pattern, I am thinking of extracting
>>>>>> the profile we do use from our lexer defs, and refer back to those sections
>>>>>> as being normative. Does that seem reasonable?
>>>>>>
>>>>>
>>>>> UAX-R3 claims:
>>>>> When meeting this requirement, all characters except those that have
>>>>> the Pattern_White_Space or Pattern_Syntax properties are available for use
>>>>> as identifiers or literals.
>>>>>
>>>>> Can you explain how that is true (especially for C++ identifiers) with
>>>>> a profile?
>>>>>
>>>>> Also, is it clear that our identifier grammar confers syntactic
>>>>> relevance to, for example, XID_Start characters for this profile?
>>>>>
>>>>>
>>>>>>
>>>>>> They've also clarified in a few places where the intent has always
>>>>>> been to choose from one of the alternatives, where for example Restricted
>>>>>> Format Characters R1a is ruled out if you're following R1,
>>>>>> https://unicode.org/reports/tr31/#R1 and
>>>>>> https://unicode.org/reports/tr31/#R1a , because the characters are
>>>>>> excluded from XID_Continue.
>>>>>>
>>>>>> My plan is to have a draft early next week, and it's currently on our
>>>>>> NB comment list, with a note to have changes to refer to.
>>>>>>
>>>>>> I'll also add the updates from 14 to 15 as comments. Comments are due
>>>>>> for us to INCITS on the 28th, so internal discussion is leaning towards
>>>>>> getting in anything we think should be considered, and better two comments
>>>>>> than none.
>>>>>> --
>>>>>> SG16 mailing list
>>>>>> SG16_at_[hidden]
>>>>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>>>>>
>>>>>
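
To make the quoted subset claim concrete ("our whitespace definition is entirely disjoint from identifier characters, but a subset of the Unicode whitespace characters"), here is a minimal compile-time sketch. The Pattern_White_Space list is the 11 code points from UAX #31. The C++ whitespace list (space, HT, LF, VT, FF) is my reading of [lex] and should be treated as an assumption, as should every name in the code. The disjointness check covers only ASCII identifier characters, not the full XID_Start/XID_Continue sets.

```cpp
#include <array>
#include <cstddef>

// Pattern_White_Space code points as listed in UAX #31.
constexpr std::array<char32_t, 11> pattern_white_space = {
    0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020,
    0x0085, 0x200E, 0x200F, 0x2028, 0x2029};

// ASSUMPTION: whitespace characters recognized by the C++ lexer per [lex]:
// horizontal tab, new-line (LF), vertical tab, form feed, space.
constexpr std::array<char32_t, 5> cpp_whitespace = {
    0x0009, 0x000A, 0x000B, 0x000C, 0x0020};

// Linear membership test usable in constant expressions.
template <std::size_t N>
constexpr bool contains(const std::array<char32_t, N>& set, char32_t c) {
    for (std::size_t i = 0; i < N; ++i)
        if (set[i] == c) return true;
    return false;
}

constexpr bool is_pattern_white_space(char32_t c) {
    return contains(pattern_white_space, c);
}

// Hypothetical helper: ASCII identifier characters only
// (letters, digits, underscore).
constexpr bool is_ascii_identifier_char(char32_t c) {
    return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') ||
           (c >= U'0' && c <= U'9') || c == U'_';
}

// True if every C++ whitespace character is also Pattern_White_Space.
constexpr bool cpp_whitespace_is_subset() {
    for (std::size_t i = 0; i < cpp_whitespace.size(); ++i)
        if (!is_pattern_white_space(cpp_whitespace[i])) return false;
    return true;
}

// True if no C++ whitespace character is an ASCII identifier character.
constexpr bool cpp_whitespace_disjoint_from_ascii_identifiers() {
    for (std::size_t i = 0; i < cpp_whitespace.size(); ++i)
        if (is_ascii_identifier_char(cpp_whitespace[i])) return false;
    return true;
}

static_assert(cpp_whitespace_is_subset(),
              "C++ whitespace should be a subset of Pattern_White_Space");
static_assert(cpp_whitespace_disjoint_from_ascii_identifiers(),
              "C++ whitespace should not overlap ASCII identifier characters");
```

Under these assumptions the subset is proper: U+0085, U+200E, U+200F, U+2028 and U+2029 are Pattern_White_Space but are not in the C++ list above, which is the gap a future extension of whitespace would have to address.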

Received on 2022-09-15 03:48:24