On May 14, 2025, at 1:37 PM, Corentin Jabot <corentinjabot@gmail.com> wrote:

I said I'd give feedback to Alisdair on P3556R0 before the meeting.

So briefly:

- Using "source file" is fine, ship it.

- For "source text", if we want to distinguish phase 1 and post-phase-1 by using that term (I don't love it, but it seems adequate), I think we are missing a definition for it.

Maybe you could improve this by adding a paragraph at the end of phase 1:

> This sequence of translation character set elements is termed the _source text_.

(There are probably less awkward ways to do that, but we mention "sequence of translation character set elements" twice in the last two paragraphs of phase 1)

This paper, for me, does not resolve the confusion around the use of the term "header file" (https://github.com/cplusplus/CWG/issues/665) - but I don't think we necessarily want to do that in this paper.

Nit: top of page 9, "source text of the source file" seems redundant

--

P3657R0

I wish we had a grammar for comments. Other than that, ship it.

Thanks for working on this, Alisdair!

On Wed, May 14, 2025 at 5:32 AM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:

SG16 will hold a meeting today/tomorrow, Wednesday, May 14th, at 19:30 UTC (timezone conversion).

If you need a .ics file to import into your calendar, you can download it here.

The agenda follows.

P3658R0: Adjust identifier following new Unicode recommendations.

P3556R0: Input files are source files.

P3657R0: A Grammar for Whitespace Characters.

P3658R0, by our good friend Robin Leroy, seeks to adjust the character allowances for identifiers to include a more consistent set of mathematical symbols. This recommendation comes from the UTC in the wake of the adoption of P1949R7 (C++ Identifier Syntax using Unicode Standard Annex 31) for C++23, a paper I'm sure you all remember well. Deployment of P1949 was found to break some existing code whose identifiers contained mathematical symbols that P1949R7 made invalid, even though those identifiers seemed quite reasonable given similar identifiers that remained valid. The UTC investigated and produced a recommendation for general purpose programming languages, published in UTS #55 (Unicode Source Code Handling). The Unicode stability policy prohibited directly changing the XID_Start and XID_Continue properties, so a Mathematical Compatibility Notation Profile was defined with corresponding ID_Compat_Math_Start and ID_Compat_Math_Continue properties to identify the member characters. The proposed changes are rather straightforward: modify the identifier-start and identifier-continue grammar productions to include characters identified by the new properties.
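
As a rough illustration of the shape of that change (not the paper's wording, and not how any compiler implements it), here is a small Python sketch. It assumes Python's `str.isidentifier()` is an acceptable stand-in for "underscore or XID_Start, followed by XID_Continue", and it uses tiny hand-picked sample sets in place of the real ID_Compat_Math_Start and ID_Compat_Math_Continue properties. The function names, the `allow_math` flag, and the sample sets are all hypothetical; the authoritative member lists live in the Unicode Character Database. NFC normalization requirements are ignored here.

```python
# Hypothetical sample sets; NOT the full Unicode properties.
MATH_START_SAMPLE = {"\u2202", "\u2207", "\u221E"}  # partial differential, nabla, infinity
MATH_CONTINUE_SAMPLE = MATH_START_SAMPLE | {"\u00B2", "\u00B3", "\u00B9"}  # plus superscript 2, 3, 1


def is_identifier_start(ch, allow_math=False):
    # str.isidentifier() on one character approximates "underscore or XID_Start"
    # according to the Unicode data bundled with the running Python.
    return ch.isidentifier() or (allow_math and ch in MATH_START_SAMPLE)


def is_identifier_continue(ch, allow_math=False):
    # Prefixing a known-good start character turns the check into an
    # approximation of XID_Continue for the single character ch.
    return ("a" + ch).isidentifier() or (allow_math and ch in MATH_CONTINUE_SAMPLE)


def is_identifier(text, allow_math=False):
    # An identifier is one start character followed by zero or more continue characters.
    if not text:
        return False
    if not is_identifier_start(text[0], allow_math):
        return False
    return all(is_identifier_continue(ch, allow_math) for ch in text[1:])
```

With `allow_math=False` the sketch models the rules as adopted via P1949R7; with `allow_math=True` it models the extended productions, where the math characters are accepted alongside the XID-based ones rather than by changing the XID properties themselves.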

P3556R0 and P3657R0 come to us courtesy of Alisdair Meredith. These papers are intended to clarify core language wording related to input/source file terminology and the specification of whitespace characters. Both papers are nearly editorial in nature, but sufficiently complicated to warrant CWG review; SG16 was asked to review them since they touch topics near and dear to us. P3556R0 is not intended to have any impact on existing implementations. P3657R0 includes two normative changes: it addresses CWG 1655 (Line endings in raw string literals), and it removes a case of IFNDR from [lex.comment]p1, as previously proposed by Corentin in P2348R3 (Whitespaces Wording Revamp).

Tom.

--
SG16 mailing list
SG16@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg16
Link to this post: http://lists.isocpp.org/sg16/2025/05/4571.php