This is your friendly reminder that this meeting is taking place tomorrow.
I am finalizing updates to D2572R0 now and will communicate an
updated draft before the meeting. I apologize for the limited
review time. We won't poll forwarding it if any concerns are
There has been considerable discussion about L2/22-072R in other replies within this email thread; please try to read those if you haven't already.
SG16 will hold a telecon on Wednesday, May 25th, at 19:30 UTC (timezone conversion).
The agenda is:
- D2572R0: std::format() fill character allowances
  - Continue review pending the availability of an updated revision.
- L2/22-072R: Proposal for amendments to UAX#9 and UAX#31
  - Review for familiarity and relevance to P1949: C++ Identifier Syntax using Unicode Standard Annex 31.
L2/22-072R was produced by the Unicode Source Code Ad-Hoc Group and adopted in April into the proposed updates for Unicode 15 per the Draft Minutes of UTC Meeting 171. Thanks are owed to Robin Leroy (CC'd) for bringing this paper to our attention. The paper discusses handling of source code that contains characters that have right-to-left (RTL) directionality.
The changes made to UAX#9 (Unicode Bidirectional Algorithm) (in yellow highlight) are concerned with presentation of source code and are therefore more of a concern for SG15 (Tooling), where they would be applicable to compilers (e.g., in diagnostics), editors, code review tools, etc.
The changes to UAX#31 (Unicode Identifier and Pattern Syntax) (in yellow highlight) clarify that rule UAX31-R3 is applicable to programming languages. They also present an example illustrating how use of LEFT-TO-RIGHT MARK (LRM) and RIGHT-TO-LEFT MARK (RLM) as whitespace characters (but not in isolation) may be desirable so that source code rendered as plain text is not presented in a confusing or surprising manner.
The adopted changes suggest (at least) the following items for us to consider:
- [uaxid.pattern]p2, as added by P1949, states that UAX31-R3 is not applicable to C++, but in light of the updates above, that is not correct. The entry should be updated to state our conformance and possibly declare a profile for our use of Pattern_White_Space and Pattern_Syntax characters.
- Per the example added to UAX31-R3, consider allowing LRM and RLM to appear in whitespace (this would be an additional change to consider on top of P2348: Whitespaces Wording Revamp after C++23 pending updated Unicode guidance).
- Consider proposing recommended display behaviors to SG15, presumably in line with HL4 from UAX#9 section 4.3, "Higher-Level Protocols". My understanding is that Microsoft Visual Studio implements this behavior. Opportunities for diagnostic improvements can be seen at https://godbolt.org/z/MM1xE5dM1 (note that the caret position is not aligned with the identifier it intends to highlight; this is because the code display and caret location are not in sync with regard to how RTL characters affect presentation).
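To make the second item above a little more concrete, here is a minimal C++ sketch of one possible reading of "LRM and RLM as whitespace, but not in isolation". It is not wording from L2/22-072R, P1949, or P2348. The function names are hypothetical, the use of the full Pattern_White_Space set (rather than the current C++ whitespace characters) is an assumption, and "not in isolation" is interpreted here as "the run must contain at least one whitespace character other than LRM or RLM".

```cpp
#include <string_view>

constexpr char32_t LRM = 0x200E;  // LEFT-TO-RIGHT MARK
constexpr char32_t RLM = 0x200F;  // RIGHT-TO-LEFT MARK

constexpr bool is_directional_mark(char32_t c) {
    return c == LRM || c == RLM;
}

// Pattern_White_Space code points other than LRM and RLM.
constexpr bool is_other_pattern_white_space(char32_t c) {
    return (c >= 0x0009 && c <= 0x000D) || c == 0x0020 || c == 0x0085 ||
           c == 0x2028 || c == 0x2029;
}

// Hypothetical rule (an assumption for illustration only): a run counts as
// whitespace if every code point is Pattern_White_Space and at least one of
// them is not a directional mark. An empty run is rejected.
constexpr bool is_whitespace_run(std::u32string_view run) {
    bool saw_other_white_space = false;
    for (char32_t c : run) {
        if (is_other_pattern_white_space(c)) {
            saw_other_white_space = true;
        } else if (!is_directional_mark(c)) {
            return false;  // not whitespace at all
        }
    }
    return saw_other_white_space;
}
```

Under this reading, a space followed by LRM would be accepted as whitespace, while a lone RLM between two tokens would not. Whether that is the right rule for C++ is exactly the kind of question the item above asks us to consider.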