Steve, I'm assuming the motivation for this email was the claim in the abstract for P1953 that SG16 is "looking at extending the basic character set"? Regardless, that part of the paper should be corrected. Unicode identifiers have been valid since C++11; we're actually looking at adding more restrictions as opposed to extending the allowed characters.

Corentin, another minor correction: in the primer section, characters are converted, not tokens. Continuing this pedantic streak, the basic source character set also contains space and a few control characters (http://eel.is/c++draft/lex.charset#1).

Tom.

On 2/10/20 12:19 PM, Steve Downey via SG16 wrote:

It's worth noting that identifiers can include Unicode characters today via universal character names. It's unwieldy and therefore uncommon, but possible.
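For illustration, here is a minimal sketch of what that looks like. The names below are made up for this example, and it assumes a compiler that accepts universal character names in identifiers (recent GCC and Clang do, as far as I know).

```cpp
#include <cassert>

// Hypothetical example: \u00e9 is the universal character name for
// U+00E9 (LATIN SMALL LETTER E WITH ACUTE), so the function below is
// named "café_count" and its local variable is named "été".
inline int caf\u00e9_count() {
    int \u00e9t\u00e9 = 3;
    return \u00e9t\u00e9 + 1;
}
```

The spelling is legal but hard to read, which is probably why it rarely shows up in real code.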

SG16 is looking to regularize the use of Unicode characters in identifiers via TR31, but they are already allowed.

Not following TR31, particularly its normalized forms for comparison, will make reflection and reification infinitely harder.
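To make the normalization concern concrete, here is a rough sketch, not anything from a proposal. The helper names are hypothetical, and the strings stand in for identifier spellings that a reflection facility might hand back as UTF-8.

```cpp
#include <string>

// "café" with a precomposed é (NFC): U+00E9, UTF-8 bytes C3 A9.
inline std::string nfc_name() { return "caf\xC3\xA9"; }

// "café" with a decomposed é (NFD): U+0065 followed by U+0301
// (COMBINING ACUTE ACCENT), UTF-8 bytes 65 CC 81.
inline std::string nfd_name() { return "cafe\xCC\x81"; }

// Naive byte-wise comparison, with no normalization step applied first.
inline bool same_spelling(const std::string& a, const std::string& b) {
    return a == b;
}
```

Both strings render the same way, yet same_spelling(nfc_name(), nfd_name()) is false because the byte sequences differ. Without an agreed normalization form, code that compares reflected names could see two identifiers where a reader sees one.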