CWG reviewed US-3-030 today. Minutes here.
I didn't schedule this issue for discussion in SG16 because I didn't think there was anything interesting for SG16 to weigh in on. However, the CWG discussion turned out to be more interesting than I anticipated. I expect that the resolution direction described below has SG16 consensus, but am sending this message to provide an opportunity to object if anyone has concerns.
During the discussion, I noted that one of the primary goals of P2295 (Support for UTF-8 as a portable source file encoding) was to ensure a portable source file. For a source file to be portable, new-line character sequences must be portably recognized as such. The change proposed with the NB comment left the set of character sequences that constitute a new-line unspecified. I expressed a desire to specify which character sequences constitute a new-line. We then discussed which sequences should be recognized and settled on LF and CR+LF. Support for CR on its own was discussed, but it was felt more evidence and motivation should be provided for that case.
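To make the direction concrete, here is a minimal sketch of what recognizing only LF and CR+LF could look like when splitting a UTF-8 buffer into lines. The function name `split_lines` is hypothetical, not taken from any implementation or from the proposed wording. Leaving a lone CR in the line content is an assumption made for illustration, since CWG left that case open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (illustration only): splits a UTF-8 buffer into
// lines, treating only LF and CR+LF as new-line sequences.
// Assumption: a lone CR is NOT a new-line and stays in the line content.
std::vector<std::string> split_lines(std::string_view src) {
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        if (c == '\n') {
            // LF on its own terminates the current line.
            lines.push_back(current);
            current.clear();
        } else if (c == '\r' && i + 1 < src.size() && src[i + 1] == '\n') {
            // CR+LF terminates the current line; skip the LF as well.
            lines.push_back(current);
            current.clear();
            ++i;
        } else {
            // Any other code unit, including a lone CR, is line content.
            current.push_back(c);
        }
    }
    // A final line without a trailing new-line is still reported.
    if (!current.empty()) {
        lines.push_back(current);
    }
    return lines;
}
```

With this sketch, a buffer that mixes LF and CR+LF line endings yields the same lines regardless of which sequence terminated each one, while a buffer that uses lone CR would come back as a single line. That is the portability concern in miniature.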
The direction to specify LF and CR+LF as new-line character sequences was believed to be consistent with P2348 (Whitespaces Wording Revamp), which both SG16 and EWG have previously approved (see polling records in the corresponding GitHub issue). However, upon reviewing the wording, it looks to me as though P2348 does permit CR by itself to constitute a new-line (see the proposed grammar additions for line-break in [lex.whitespaces]). That seems intentional, but it isn't discussed in the paper, so I'm not quite sure (the paper does discuss LF+CR but stops short of proposing support for it).
So, if anyone strongly feels that a lone CR in a UTF-8 source file should be considered a new-line in portable source files, please respond. Please note that implementors can support whatever new-line character sequences they desire under the "For any other kind of input file supported by the implementation ..." part of [lex.phases]p1.
Note that the choices made to resolve this issue might require implementations to make changes (e.g., to recognize new-line sequences that they don't today).
Tom.