Re: [SG16] D1949R4 - Unicode Identifiers

From: Steve Downey <sdowney_at_[hidden]>
Date: Wed, 27 May 2020 18:06:01 -0400
I will treat this as feedback for R5 of the paper. Barring errors, I am
treating R4 as if it were already submitted for the mailing, and will
submit it before posting to the EWG reflector.

On Wed, May 27, 2020, 17:43 Tom Honermann via SG16 <sg16_at_[hidden]>
wrote:

> Since we just polled a revision without updates to address Alisdair's
> feedback below, I'd like to request that any updates done based on this
> feedback go into a new revision. Let's treat these comments as if they are
> early EWG feedback.
>
> Tom.
>
> On 5/27/20 3:45 PM, Alisdair Meredith via SG16 wrote:
>
> More non-technical feedback; none of this should affect SG16.
>
> For 6.1, do not assume a wider audience in EWG will immediately
> know that ZWJ is a Zero Width Joiner. A clearer title or Capitalizing
> The Term may help? Possibly state what a zero width joiner is
> before noting that some scripts rely on them?
>
> (Yes, we can work this out easily enough, but I aspire to papers
> that are simple to read without back-tracking).
>
> 8 : after explaining normalization form C, conclude the example by
> explaining how À is represented in this form.
>
>
> Is it worth calling out in the ABI compatibility note that we would be
> breaking compatibility between versions of the same compiler, in
> order to achieve compatibility between all compilers on that platform?
> It can be inferred, but I think stating it might be clearer.
>
> 9.2 R3: are these the patterns that the pattern matching papers will
> introduce a need for? Maybe add a ‘yet’ if so.
>
> AlisdairM
>
> On May 27, 2020, at 19:32, Steve Downey via SG16 <sg16_at_[hidden]>
> wrote:
>
> Updated per various reviewer comments.
> Diff:
>
> https://github.com/steve-downey/papers/commit/152d9e145e437a0f1377bd4850d83564f2feab6b#diff-a1a983c3f3fa85cc5db932bc8b0e7638
>
> On Wed, May 27, 2020 at 2:21 PM Steve Downey <sdowney_at_[hidden]> wrote:
>
>> Adopted the changes, except for the footnote, which corresponds to how
>> the LaTeX is marked up, with the \footnote inline in the text. The footnote
>> doesn't actually move; it's the rest of the text around it that moves.
>>
>> On Wed, May 27, 2020 at 1:24 AM Tom Honermann <tom_at_[hidden]> wrote:
>>
>>> Thanks, Steve. A few nit-picky comments below.
>>>
>>> In the new "Summary" section, in addition to noting that emoji will no
>>> longer be allowed in identifiers, I think it would be helpful to note that
>>> identifiers previously allowed for some scripts will no longer be allowed.
>>> This is mentioned in section 6.1, but I think also worthy of mention in the
>>> summary.
>>>
>>> In section 7, there is an instance of "C++. C++.".
>>>
>>> Section 7 states that N3146 "considered using UAX31". My reading of
>>> N3146 is that it did use UAX #31, but it adapted what was then called the
>>> "Alternative Identifier Syntax" option. Unicode 9 renamed "Alternative
>>> Identifier Syntax" to "Immutable Identifiers". The relevant text from
>>> N3146 is:
>>>
>>> The set of UCNs *disallowed* in identifiers in C and C++ should exactly
>>> match the specification in [AltId], *with the following additions*: all
>>> characters in the Basic Latin (i.e. ASCII, basic source character) block,
>>> and all characters in the Unicode General Category "Separator, space".
>>>
>>> [AltId] corresponds to:
>>>
>>> Unicode Standard Annex #31: Unicode Identifier and Pattern Syntax,
>>> "Alternative Identifier Syntax",
>>> http://www.unicode.org/reports/tr31/tr31-11.html#Alternative_Identifier_Syntax
>>>
>>>
>>> Section 7 also states, "The Unicode standard has since made stability
>>> guarantees about identifiers, and created the XID_Start and XID_Continue
>>> properties to alleviate the stability concerns that existed in 2010."
>>> However, the Unicode 5.2 version of UAX #31 referenced by N3146 does
>>> reference XID_Start and XID_Continue. It looks to me like the XID
>>> properties have been around since at least 2005 and Unicode 4. Perhaps the
>>> XID properties were not stable at that time? Regardless, it looks like the
>>> quoted sentence needs an update.
>>>
>>> In section 9.3, the sub-sections are arguably out of order. The first
>>> two sub-sections are for R1 and R4 (requirements that are met), and the
>>> remaining sub-sections list requirements that are not met (including R1a,
>>> R1b, R2, and R3). I think the sub-section order should follow the
>>> requirement order (R1, R1a, R1b, R2, R3, R4, ...)
>>>
>>> In section 10, the end of the first paragraph appears to be missing an
>>> "XID"; "... character classes XID_Start and _Continue."
>>>
>>> In the wording for [lex.name]p1, the footnote is moved into the
>>> paragraph, but still states "footnote" instead of "note". If this is
>>> because Jens indicated this is how the editors expect relocation of a
>>> footnote to be communicated, then ignore this comment.
>>>
>>> In the wording for [lex.name]p1, the copied footnote text doesn't match
>>> the WP. There is a missing "\u in".
>>>
>>> In the annex wording for X.2 R1, can we avoid duplicating the grammar
>>> specification from [lex.name]?
>>>
>>> Tom.
>>>
>>> On 5/26/20 4:51 PM, Steve Downey via SG16 wrote:
>>>
>>> Find attached a draft of the UAX31 paper for discussion.
>>> Viewable at
>>> http://htmlpreview.github.io/?https://github.com/steve-downey/papers/blob/master/generated/p1949.html
>>> Source at https://github.com/steve-downey/papers/blob/master/p1949.md
>>>
>>> (note that GitHub doesn't render it the same way that mpark's WG21 format
>>> does)
>>>
>>>
>>> <p1949.html>
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>
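
For reference on the section 8 comment above (concluding the example by
showing how À is represented in normalization form C), here is a minimal
sketch using Python's standard unicodedata module. It assumes the paper's
example starts from the decomposed spelling, A followed by U+0300 COMBINING
GRAVE ACCENT. That starting point is an assumption, since the paper text is
not quoted in this thread, and the helper names to_nfc and code_points are
made up for illustration.

```python
import unicodedata

# Assumed inputs: two spellings of the same visible character.
decomposed = "A\u0300"   # LATIN CAPITAL LETTER A + COMBINING GRAVE ACCENT
precomposed = "\u00C0"   # LATIN CAPITAL LETTER A WITH GRAVE


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


def code_points(text: str) -> list[str]:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(decomposed))          # ['U+0041', 'U+0300']
print(code_points(to_nfc(decomposed)))  # ['U+00C0']
print(to_nfc(decomposed) == precomposed)  # True
```

In form C both spellings end up as the single code point U+00C0, so two
identifiers that differ only in how À was typed compare equal after
normalization.
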

Received on 2020-05-27 17:09:18