Cameron forwarded the following to me after our meeting yesterday.

C#'s grammar for identifiers is defined at the following link.  Basically, identifiers match UAX#31 with a few additions (most of which make sense for C++ as well).
- https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/lexical-structure#identifiers

A fun blog post exploring C# Unicode identifiers, and the possibility of a character's classification changing over time, is at the following link.
- https://codeblog.jonskeet.uk/2014/12/01/when-is-an-identifier-not-an-identifier-attack-of-the-mongolian-vowel-separator/

Tom.

On 5/15/19 1:39 PM, Tom Honermann wrote:
Thanks for bringing this to our attention.  I agree there are opportunities for improvement here.  I filed a new SG16 issue to track this.

https://github.com/sg16-unicode/sg16/issues/48

I encourage anyone interested in this to sign up to write a paper or provide additional background material in the issue (e.g., more history about the current list of ranges, or an analysis of UAX#31 and its applicability to C++).

Tom.

On 5/10/19 12:43 PM, JF Bastien wrote:
Hi C++ પกٱƈѻɗﻉ ḟäṅṡ 👋!

The current list of valid identifier characters is pretty silly (see [lex.name] 5.10 Identifiers or cppreference summary). It allows characters such as the zero-width joiner and zero-width space, among other questionable choices (see how bad this can get, h/t Richard Kogelnig).
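
To make the zero-width point concrete, here is a minimal C++ sketch (C++14 or later). The names CodePointRange, kAllowedExcerpt, and in_allowed_excerpt are hypothetical, and the table is only a hand-picked excerpt of the allowed ranges as I understand them from the C++17 annex on universal character names for identifiers. It is not the full list and should be checked against the standard before being relied on.

```cpp
// Hypothetical illustration only: a partial excerpt of the code point
// ranges that C++17 permits in identifiers. NOT the complete table.
struct CodePointRange {
    char32_t first;
    char32_t last;
};

constexpr CodePointRange kAllowedExcerpt[] = {
    {0x00C0, 0x00D6},  // Latin-1 letters
    {0x200B, 0x200D},  // zero-width space, non-joiner, joiner
    {0x202A, 0x202E},  // bidirectional embedding/override controls
    {0x2060, 0x206F},  // word joiner and other invisible format characters
};

// Returns true if c falls inside one of the excerpted ranges above.
constexpr bool in_allowed_excerpt(char32_t c) {
    for (const auto& r : kAllowedExcerpt) {
        if (c >= r.first && c <= r.last) {
            return true;
        }
    }
    return false;
}

// The invisible characters mentioned above are accepted by the excerpt.
static_assert(in_allowed_excerpt(0x200B), "zero-width space is allowed");
static_assert(in_allowed_excerpt(0x200D), "zero-width joiner is allowed");
```

A real checker would need the full table, plus the separate list of characters that are not allowed as the first character of an identifier.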

I asked where it came from, and IIUC John looked at Unicode and cobbled the list of valid ranges manually. That ain't great.

Is this group interested in fixing things?

There's already an existing standard for this; maybe it's something we can adopt as-is or use as a starting point:

Further, the tooling group was just talking about module names. I think we should allow any valid identifier as a module name, and look at how this could map to file names for a tooling TR's purpose.

Thanks,

J̙̘̗̘̟͐̀̎F͚̜͈̖͉̗̘̊

_______________________________________________
SG16 Unicode mailing list
Unicode@isocpp.open-std.org
http://www.open-std.org/mailman/listinfo/unicode


