Subject: Re: [isocpp-ext] Virtual evolution meeting on Unicode identifiers, Thursday June 18th @ 10AM Pacific
From: Alisdair Meredith (alisdairm_at_[hidden])
Date: 2020-06-18 08:35:19
Late question looking over the paper.
How does the restriction of identifiers to follow the Unicode
specification compare to the C standard Annex D?
I ask, as I am wondering if we want some more wording
for C++ Annex C.5 [diff.iso] on which identifiers may (or may
not) be used in shared code.
> On Jun 8, 2020, at 19:14, JF Bastien via Ext <ext_at_[hidden]> wrote:
> Hello âºâ¥âââ¤â£âââ,
> Next week, on Thursday the 18th at 10AM Pacific, we'll be discussing Unicode identifiers. The paper was on our "tentatively ready" list as of Prague, but it received some feedback and has been updated as detailed by Steve below. I'd like us to discuss the changes and tentatively leave it on the tentatively ready list, so that the next time we can make decisions we can reaffirm that we're forwarding it to Core (as per our process <http://wg21.link/p1999>).
> Updated paper:
> https://isocpp.org/files/papers/P1949R4.html
> Here's the GitHub issue:
> https://github.com/cplusplus/papers/issues/688
> Meeting information:
> Zoom Meeting ID 735059607
> Zoom Meeting Password template
> Zoom Meeting Room https://iso.zoom.us/j/735059607?pwd=d2tzRkZrTGY1c241R2prOVIrVnNXdz09
> Zoom Meeting Automatic Phone-In US: +16699006833,,735059607# or +14086380968,,735059607#
> Zoom Meeting Phone Number US: +1 669 900 6833
> International numbers available https://iso.zoom.us/u/acPWjSNM0
> See you then!
> p.s. this is valid C++:
> int âºâ¥âââ¤â£âââ;
> On Fri, Jun 5, 2020 at 1:37 PM Steve Downey via SG16 <sg16_at_[hidden]> wrote:
> Last week SG16 (Text) approved forwarding this paper to EWG for consideration. It addresses fixing the state of allowed identifiers in C++.
> https://isocpp.org/files/papers/P1949R4.html (also attached as d1949.html)
> Summary <https://isocpp.org/files/papers/D1949R4.html#summary>
> The allowed Unicode code points in identifiers include many that are unassigned or unnecessary, and others that are actually counter-productive. By adopting the recommendations of UAX #31, Unicode Identifier and Pattern Syntax, C++ will be easier to work with in international environments and less prone to accidental problems.
> This proposal does not address some potential security concerns (so-called homoglyph attacks) where letters that appear the same may be treated as distinct. Methods of defense against such attacks are complex and evolving, and requiring mitigation strategies would impose substantial implementation burden.
> This proposal also recommends adoption of Unicode normalization form C (NFC) for identifiers to ensure that when compared, identifiers intended to be the same will compare as equal. Legacy encodings are generally naturally in NFC when converted to Unicode. Most tools will, by default, produce NFC text.
> Some unusual scripts require the use of characters as joiners that are not allowed by UAX #31; these will no longer be available as identifiers in C++.
> As a side-effect of adopting the identifier characters from UAX #31, using emoji in or as identifiers becomes ill-formed.
> See also
> https://unicode.org/reports/tr31/ Unicode® Standard Annex #31 UNICODE IDENTIFIER AND PATTERN SYNTAX
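The NFC point in the summary above can be illustrated with a small sketch. It assumes UTF-8 source text and uses hypothetical names (`kPrecomposed`, `kDecomposed`, `same_spelling`, `check_spellings_differ`) that do not come from the paper. It performs no normalization itself; it only shows that two spellings of the same visible text compare unequal when compared code unit by code unit, which is the situation that requiring NFC is meant to rule out.

```cpp
#include <cassert>
#include <string>

// Two UTF-8 spellings of the same visible text, "é".
// Precomposed form: U+00E9 (this one is already in NFC).
const std::string kPrecomposed = "\xC3\xA9";
// Decomposed form: U+0065 'e' followed by U+0301 COMBINING ACUTE ACCENT.
const std::string kDecomposed = "e\xCC\x81";

// Hypothetical helper: compares two identifier spellings the way a tool
// that does no normalization would, i.e. code unit by code unit.
bool same_spelling(const std::string& a, const std::string& b) {
    return a == b;
}

// Illustrative check: without normalization the two spellings are
// distinct, even though they render identically.
void check_spellings_differ() {
    assert(kPrecomposed.size() == 2);  // one code point, two UTF-8 bytes
    assert(kDecomposed.size() == 3);   // two code points, three UTF-8 bytes
    assert(!same_spelling(kPrecomposed, kDecomposed));
}
```

Calling `check_spellings_differ()` from `main` should pass silently. Normalizing both strings to NFC first (for example with a library such as ICU, which is not shown here) would be expected to map the decomposed spelling onto the precomposed one.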
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
> Ext mailing list
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/ext
> Link to this post: http://lists.isocpp.org/ext/2020/06/14104.php