Re: [SG16] D1949R4 - Unicode Identifiers

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 27 May 2020 17:42:38 -0400
Since we just polled a revision without updates to address Alisdair's
feedback below, I'd like to request that any updates done based on this
feedback go into a new revision. Let's treat these comments as if they
are early EWG feedback.


On 5/27/20 3:45 PM, Alisdair Meredith via SG16 wrote:
> More non-technical feedback, none of this should affect SG16.
> For 6.1, do not assume a wider audience in EWG will immediately
> know that ZWJ is a Zero Width Joiner. A clearer title or Capitalizing
> The Term may help? Possibly state what a zero width joiner is
> before noting that some scripts rely on them?
> (Yes, we can work this out easily enough, but I aspire to papers
> that are simple to read without back-tracking).
> 8 : after explaining normalization form C, conclude the example by
> explaining how À is represented in this form.
> Is it worth calling out in the ABI compatibility note that we would be
> breaking compatibility between versions of the same compiler, in
> order to achieve compatibility between all compilers on that platform?
> It can be inferred, but I think stating it might be clearer.
> 9.2 R3 are these the patterns that the pattern matching papers will
> introduce a need for? Maybe add a ‘yet’ if so.
> AlisdairM
>> On May 27, 2020, at 19:32, Steve Downey via SG16
>> <sg16_at_[hidden]> wrote:
>> Updated per various reviewer comments.
>> Diff:
>> https://github.com/steve-downey/papers/commit/152d9e145e437a0f1377bd4850d83564f2feab6b#diff-a1a983c3f3fa85cc5db932bc8b0e7638
>> On Wed, May 27, 2020 at 2:21 PM Steve Downey <sdowney_at_[hidden]> wrote:
>> Adopted the changes, except for the footnote, which corresponds
>> to how the LaTeX is marked up, with the \footnote inline in the
>> text. The footnote doesn't actually move, it's the rest of
>> the text around it.
>> On Wed, May 27, 2020 at 1:24 AM Tom Honermann <tom_at_[hidden]> wrote:
>> Thanks, Steve. A few nit-picky comments below.
>> In the new "Summary" section, in addition to noting that
>> emoji will no longer be allowed in identifiers, I think it
>> would be helpful to note that identifiers previously allowed
>> for some scripts will no longer be allowed. This is
>> mentioned in section 6.1, but I think also worthy of mention
>> in the summary.
>> In section 7, there is an instance of "C++. C++.".
>> Section 7 states that N3146 "considered using UAX31". My
>> reading of N3146 is that it did use UAX #31, but it adapted
>> what was then called the "Alternative Identifier Syntax"
>> option. Unicode 9 renamed "Alternative Identifier Syntax" to
>> "Immutable Identifiers". The relevant text from N3146 is:
>>> The set of UCNs *disallowed* in identifiers in C and C++
>>> should exactly match the specification in [AltId], *with the
>>> following additions*: all characters in the Basic Latin
>>> (i.e. ASCII, basic source character) block, and all
>>> characters in the Unicode General Category "Separator, space".
>> [AltId] corresponds to:
>>> Unicode Standard Annex #31: Unicode Identifier and
>>> Pattern Syntax, "Alternative Identifier Syntax",
>>> http://www.unicode.org/reports/tr31/tr31-11.html#Alternative_Identifier_Syntax
>> Section 7 also states, "The Unicode standard has since made
>> stability guarantees about identifiers, and created the
>> XID_Start and XID_Continue properties to alleviate the
>> stability concerns that existed in 2010." However, the
>> Unicode 5.2 version of UAX #31 referenced by N3146 does
>> reference XID_Start and XID_Continue. It looks to me like
>> the XID properties have been around since at least 2005 and
>> Unicode 4. Perhaps the XID properties were not stable at
>> that time? Regardless, it looks like the quoted sentence
>> needs an update.
>> In section 9.3, the sub-sections are arguably out of order.
>> The first two sub-sections are for R1 and R4 (requirements
>> that are met), and the remaining sub-sections list
>> requirements that are not met (including R1a, R1b, R2, and
>> R3). I think the sub-section order should follow the
>> requirement order (R1, R1a, R1b, R2, R3, R4, ...)
>> In section 10, the end of the first paragraph appears to be
>> missing an "XID"; "... character classes XID_Start and
>> _Continue."
>> In the wording for [lex.name]p1, the
>> footnote is moved into the paragraph, but still states
>> "footnote" instead of "note". If this is because Jens
>> indicated this is how the editors expect relocation of a
>> footnote to be communicated, then ignore this comment.
>> In the wording for [lex.name]p1, the
>> copied footnote text doesn't match the WP. There is a
>> missing "\u in".
>> In the annex wording for X.2 R1, can we avoid duplicating the
>> grammar specification from [lex.name]?
>> Tom.
>> On 5/26/20 4:51 PM, Steve Downey via SG16 wrote:
>>> Find attached a draft of the UAX31 paper for discussion.
>>> Viewable at
>>> http://htmlpreview.github.io/?https://github.com/steve-downey/papers/blob/master/generated/p1949.html
>>> Source at
>>> https://github.com/steve-downey/papers/blob/master/p1949.md
>>> (note that github doesn't format the same way that mpark's
>>> WG21 format does)
>> <p1949.html>
>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

Received on 2020-05-27 16:45:50