Date: Mon, 10 Feb 2020 18:17:40 -0500
Yes, that's right. We're not looking to change the basic source or
execution character sets, which are just the minimal sets necessary for
writing and running C++.
Note also that this isn't a problem that is being newly introduced. It's
technically a problem now, and it is becoming worse as implementations make
it easier to use characters outside the basic set.
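
As a rough illustration of the universal-character-name spelling discussed in
the quoted thread below, here is a minimal sketch. The names caf\u00E9 and
read_cafe are made up for this example, and it assumes a compiler that accepts
universal character names in identifiers (part of the language since C++11).

```cpp
// Hypothetical example: the identifier below is "caf" followed by U+00E9
// (e with acute accent), spelled with a four-digit universal character name.
constexpr int caf\u00E9 = 42;

// The eight-digit spelling \U000000E9 denotes the same code point, so this
// function refers to the same variable declared above.
constexpr int read_cafe() { return caf\U000000E9; }
```

Both spellings name one identifier because they denote the same code point.
This says nothing about sequences that only look the same, such as a
precomposed character versus a base letter plus a combining mark, which is
where the TR31 normalization question comes in.
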
On Mon, Feb 10, 2020, 17:57 Tom Honermann <tom_at_[hidden]> wrote:
> Steve, I'm assuming the motivation for this email was the claim in the
> abstract for P1953 that SG16 is "looking at extending the basic character
> set"? Regardless, that part of the paper should be corrected. Unicode
> identifiers have been valid since C++11; we're actually looking at adding
> more restrictions as opposed to extending the allowed characters.
>
> Corentin, another minor correction: in the primer section, characters are
> converted, not tokens. Continuing this pedantic streak, the basic source
> character set also contains space and a few control characters (
> http://eel.is/c++draft/lex.charset#1).
>
> Tom.
>
> On 2/10/20 12:19 PM, Steve Downey via SG16 wrote:
>
> It's worth noting that identifiers can include Unicode characters today
> via universal character names. It's unwieldy and therefore uncommon, but
> possible.
>
> SG16 is looking to regularize the use of Unicode characters in identifiers
> via TR31, but they are already allowed.
>
> Not following TR31, particularly normalized forms for comparison, will make
> reflection and reification infinitely harder.
>
>
>
Received on 2020-02-10 17:20:30