Date: Thu, 18 Jun 2020 11:35:20 -0400
On 6/18/20 9:35 AM, Alisdair Meredith via Ext wrote:
> Late question looking over the paper.
>
> How does the restriction of identifiers to follow the Unicode
> specification compare to the C standard Annex D?
The C++ status quo matches Annex D of the C standard, so the new
restrictions would introduce the same incompatibility with C as with C++20.
>
> I ask, as I am wondering if we want some more wording
> for C++ Annex C.5 [diff.iso] on which identifiers may (or may
> not) be used in shared code.
Yes, the proposed [diff.cpp20.lex] wording should be duplicated in
[diff.lex] <http://eel.is/c++draft/diff.lex>.
Assuming the proposal is adopted for C++, a proposal for C should also
be submitted. The SG16 issue tracker includes a task to do so at
https://github.com/sg16-unicode/sg16/issues/56.
Tom.
>
> AlisdairM
>
>> On Jun 8, 2020, at 19:14, JF Bastien via Ext <ext_at_[hidden]
>> <mailto:ext_at_[hidden]>> wrote:
>>
>> Hello Ⓔⓥⓞⓛⓤⓣⓘⓞⓝ,
>>
>> Next week on Thursday the 18th at 10AM Pacific we'll be discussing
>> Unicode identifiers. It was on our "tentatively ready" list as of
>> Prague, but received some feedback and has been updated as detailed
>> by Steve below. I'd like us to discuss the changes, and tentatively
>> leave it on the tentatively ready list, so next time we can make
>> decisions we reaffirm that we're forwarding to Core (as per our
>> process <http://wg21.link/p1999>).
>>
>> Updated paper:
>>
>> https://isocpp.org/files/papers/P1949R4.html
>>
>>
>> Here's the GitHub issue:
>>
>> https://github.com/cplusplus/papers/issues/688
>>
>>
>> Meeting information:
>>
>> Zoom Meeting ID 735059607
>> Zoom Meeting Password template
>> Zoom Meeting Room
>> https://iso.zoom.us/j/735059607?pwd=d2tzRkZrTGY1c241R2prOVIrVnNXdz09
>> Zoom Meeting Automatic Phone-In US: +16699006833,,735059607# or
>> +14086380968,,735059607#
>> Zoom Meeting Phone Number US: +1 669 900 6833
>> International numbers available https://iso.zoom.us/u/acPWjSNM0
>>
>>
>> See you then!
>>
>> JF
>>
>> p.s. this is valid C++:
>>
>> int Ⓔⓥⓞⓛⓤⓣⓘⓞⓝ;
>>
>>
>>
>> On Fri, Jun 5, 2020 at 1:37 PM Steve Downey via SG16
>> <sg16_at_[hidden] <mailto:sg16_at_[hidden]>> wrote:
>>
>>
>> Last week SG16 (Text) approved forwarding this paper to EWG for
>> consideration. It addresses fixing the state of allowed
>> identifiers in C++.
>>
>> https://isocpp.org/files/papers/P1949R4.html (also attached as
>> d1949.html)
>>
>>
>> Summary
>>
>> The allowed Unicode code points in identifiers include many that
>> are unassigned or unnecessary, and others that are actually
>> counter-productive. By adopting the recommendations of UAX #31,
>> Unicode Identifier and Pattern Syntax, C++ will be easier to work
>> with in international environments and less prone to accidental
>> problems.
>>
>> This proposal does not address some potential security
>> concerns (so-called homoglyph attacks) where letters that appear
>> the same may be treated as distinct. Methods of defense against
>> such attacks are complex and evolving, and requiring mitigation
>> strategies would impose substantial implementation burden.
>>
>> This proposal also recommends adoption of Unicode normalization
>> form C (NFC) for identifiers to ensure that when compared,
>> identifiers intended to be the same will compare as equal. Legacy
>> encodings are generally naturally in NFC when converted to
>> Unicode. Most tools will, by default, produce NFC text.
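
To illustrate the NFC point above, here is a minimal sketch, not taken from the paper. It assumes UTF-8 source text and a comparison that looks only at code units; the helper names are hypothetical. It shows the same visible character "é" spelled two ways, which compare unequal unless both are first normalized to the same form.

```cpp
#include <string>

// "é" as the single precomposed code point U+00E9 (UTF-8 bytes C3 A9).
// This is the NFC spelling.
inline std::string precomposed_e_acute() { return "\xC3\xA9"; }

// "é" as 'e' followed by U+0301 COMBINING ACUTE ACCENT (UTF-8 bytes 65 CC 81).
// This is the decomposed (NFD) spelling.
inline std::string decomposed_e_acute() { return "e\xCC\x81"; }

// Hypothetical helper: compares two identifier spellings code unit by code
// unit, with no normalization step. This is an assumption made for the
// illustration, not a description of any particular compiler.
inline bool same_spelling(const std::string& a, const std::string& b) {
    return a == b;
}
```

Calling same_spelling(precomposed_e_acute(), decomposed_e_acute()) returns false even though both strings render identically. Requiring identifiers to be in NFC removes this ambiguity.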
>>
>> Some unusual scripts require the use of characters as joiners
>> that are not allowed by UAX #31; these will no longer be
>> available in identifiers in C++.
>>
>> As a side-effect of adopting the identifier characters from UAX
>> #31, using emoji in or as identifiers becomes ill-formed.
>>
>>
>> See also
>> https://unicode.org/reports/tr31/ Unicode® Standard Annex #31
>> UNICODE IDENTIFIER AND PATTERN SYNTAX
>>
>>
>> --
>> SG16 mailing list
>> SG16_at_[hidden] <mailto:SG16_at_[hidden]>
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>> _______________________________________________
>> Ext mailing list
>> Ext_at_[hidden] <mailto:Ext_at_[hidden]>
>> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/ext
>> Link to this post: http://lists.isocpp.org/ext/2020/06/14104.php
>
>
> _______________________________________________
> Ext mailing list
> Ext_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/ext
> Link to this post: http://lists.isocpp.org/ext/2020/06/14240.php
Received on 2020-06-18 10:38:33