Last week SG16 (Text) approved forwarding this paper to EWG for consideration. It addresses fixing the state of allowed identifiers in C++.

https://isocpp.org/files/papers/P1949R4.html (also attached as d1949.html)

Summary

The allowed Unicode code points in identifiers include many that are unassigned or unnecessary, and others that are actually counter-productive. By adopting the recommendations of UAX #31, Unicode Identifier and Pattern Syntax, C++ will be easier to work with in international environments and less prone to accidental problems.

This proposal does not address some potential security concerns—so called homoglyph attacks—where letters that appear the same may be treated as distinct. Methods of defense against such attacks are complex and evolving, and requiring mitigation strategies would impose substantial implementation burden.

This proposal also recommends adoption of Unicode normalization form C (NFC) for identifiers to ensure that when compared, identifiers intended to be the same will compare as equal. Legacy encodings are generally naturally in NFC when converted to Unicode. Most tools will, by default, produce NFC text.

Some unusual scripts require the use of characters as joiners that are not allowed by UAX #31, these will no longer be available as identifiers in C++.

As a side-effect of adopting the identifier characters from UAX #31, using emoji in or as identifiers becomes ill-formed.


See also
https://unicode.org/reports/tr31/  Unicode® Standard Annex #31 UNICODE IDENTIFIER AND PATTERN SYNTAX