On 11/8/22 10:32 AM, William M. (Mike) Miller wrote:
On Tue, Nov 8, 2022 at 12:41 AM Tom Honermann via Core <core@lists.isocpp.org> wrote:

Thanks, Corentin.

I agree that, if ~all existing implementations already treat a lone CR as a new-line, then we might as well standardize it. However, if some don't, then we'll be adding a (probably small) implementation burden for something that I suspect is rare. LF and CR+LF are common occurrences. Do you have data that shows that lone CR 1) is recognized by ~all existing implementations, and 2) is used sufficiently often that it is worth standardizing? Do we want to encourage use of lone CR as a portable new-line? As mentioned, implementations can still support it regardless. Unicode also recognizes U+0085 (NEXT LINE), U+2028 (LINE SEPARATOR), and U+2029 (PARAGRAPH SEPARATOR) as line-break characters.

I think it would be worth adding such analysis to a future revision of P2348.

In the interest of time, is anyone opposed to the CWG direction of requiring both LF and CR+LF in portable UTF-8 source files for C++23, with support for other new-line sequences left to a future standard?


Actually, CWG changed direction in the late afternoon session and decided to accept CR as a line-termination character. I'm about to upload drafting implementing that direction for discussion today.

Ah, thank you, I'm sorry I missed that discussion.

That change resolves the inconsistency with P2348 given Corentin's explicit claim of the intent in that paper.

I'm personally happy with this new direction so long as implementors have no concerns (and it seems we already have confirmation that EDG and Clang have no concerns).

Given that we already had consensus for P2348 in SG16 and EWG, assuming no new objections are raised, ship it.

Tom.


I don't know about the ubiquity of that support, but the EDG front end has it as a build-time configuration option that customers can enable or not, as they choose. Here's the description of the flag (note that it cites gcc's processing as its basis):

/*
Flag that is TRUE to indicate that carriage return or carriage return
followed by newline can be used as a line terminator in GNU-compatible
modes.  This feature is provided to allow files with old MacOS line
terminators to be accepted.  The implementation is compatible with the way
in which the GNU compiler handles such line terminators.  It is disabled by
default because it is not required by most users.
*/
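
For illustration only, here is a minimal sketch of what accepting LF, CR+LF, and lone CR as line terminators could look like when splitting source text into lines. The function name split_source_lines is hypothetical; this is not taken from EDG, GCC, or the P2348 wording, and it ignores the other Unicode line-break characters mentioned earlier in the thread.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (an assumption for illustration, not EDG's or GCC's
// code): splits source text into lines, treating LF, CR+LF, and a lone CR
// each as exactly one line terminator.
std::vector<std::string> split_source_lines(std::string_view text) {
    std::vector<std::string> lines;
    std::size_t start = 0;  // index of the first character of the current line
    std::size_t i = 0;
    while (i < text.size()) {
        const char c = text[i];
        if (c == '\n' || c == '\r') {
            lines.emplace_back(text.substr(start, i - start));
            // A CR immediately followed by LF counts as a single terminator.
            if (c == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            start = i + 1;
        }
        ++i;
    }
    // Keep a final line that has no terminator.
    if (start < text.size()) {
        lines.emplace_back(text.substr(start));
    }
    return lines;
}
```

With this sketch, a file that mixes conventions (for example, old MacOS CR endings alongside LF and CR+LF) yields the same line count as it would with a single convention, and CR followed by CR+LF is seen as two terminators with an empty line between them.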