sg16

Re: [isocpp-core] P2295 Support for UTF-8 as a portable source file encoding

From: William M. (Mike) Miller
Date: Sat, 11 Jun 2022 10:52:23 -0400
On Sat, Jun 11, 2022 at 4:01 AM Corentin <corentin.jabot_at_[hidden]> wrote:

> New draft, using that wording, except that I'm not touching the end of
> line indicators, so that we can do that in P2348
> https://isocpp.org/files/papers/D2295R6.pdf
>

A couple of comments:

First, I really do not like the extremely repetitive use of the term
"physical source file". As I understand it, "physical" is used to
distinguish the input file from the logical source file resulting from the
Phase 1 mapping. I'd be happy with replacing the term "physical source
file" with "input file" or any other term that maintains the distinction
between pre- and post-Phase-1 source.

(A related point is the use of "physical" in Phase 2 to describe lines. I
think that's incorrect, since we're talking about backslashes and
new-lines, which are post-mapping characters and might be different
characters or, in the case of a new-line, not present at all in the
physical source file. I think it's fine to just talk about "source" in
Phase 2 and drop "physical" altogether.)
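To illustrate the Phase 2 point, here is a minimal sketch, assuming a hypothetical helper named `splice_lines` that is not taken from the paper or from any implementation. It operates only on the already-mapped character sequence, so the backslash and new-line it looks for are post-Phase-1 characters, whatever the input file contained.

```cpp
#include <cstddef>
#include <string>

// Hypothetical illustration of Phase 2 line splicing.
// `mapped` is assumed to be the output of Phase 1, so '\n' here is the
// new-line character produced by the mapping, not a byte of the input file.
std::string splice_lines(const std::string& mapped) {
    std::string out;
    out.reserve(mapped.size());
    for (std::size_t i = 0; i < mapped.size(); ++i) {
        // A backslash immediately followed by a new-line is deleted,
        // joining the two source lines into one logical line.
        if (mapped[i] == '\\' && i + 1 < mapped.size() && mapped[i + 1] == '\n') {
            ++i;  // skip the new-line as well
            continue;
        }
        out.push_back(mapped[i]);
    }
    return out;
}
```

Nothing in this sketch refers to the input file, which is why plain "source" seems sufficient in Phase 2.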

My second comment regards new-line characters and end-of-line indicators.
As I understand it, there are two real-world scenarios the existing wording
is intended to cover: cases where different characters or sequences (CR,
CRLF) are used instead of new-lines, and record-oriented files where there
is no character at the end of a line. The word "introducing" is appropriate
for the latter case, but it seems incongruous for the former. Could we
replace that phrase with "representing end-of-line indicators as new-line
characters"?
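The two scenarios can be sketched as follows. This is only an illustration under stated assumptions: `map_end_of_line` and `join_records` are hypothetical names, and the input representations (a byte string for the first case, a vector of records for the second) are my own choices, not anything in the draft.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Scenario 1 (hypothetical): the input file uses CR or CRLF as its
// end-of-line indicator. Each indicator is *represented as* a new-line.
std::string map_end_of_line(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '\r') {
            out.push_back('\n');
            // Treat CRLF as a single end-of-line indicator.
            if (i + 1 < input.size() && input[i + 1] == '\n') {
                ++i;
            }
        } else {
            out.push_back(input[i]);
        }
    }
    return out;
}

// Scenario 2 (hypothetical): a record-oriented file has no end-of-line
// character at all, so a new-line is *introduced* after each record.
std::string join_records(const std::vector<std::string>& records) {
    std::string out;
    for (const std::string& record : records) {
        out += record;
        out.push_back('\n');
    }
    return out;
}
```

In the first function a character in the input is replaced; in the second there is nothing to replace. "Representing end-of-line indicators as new-line characters" reads naturally for both.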

Received on 2022-06-11 14:52:34