Here is a draft of the changes requested by SG16, Jens, and Hubert:
https://isocpp.org/files/papers/D2348R1.pdf

I found it better for /whitespace/ to refer to a single whitespace character instead of describing a sequence. I have adjusted the pluralisation of everything accordingly.

I've added some notes to clarify the intent in important places
I've used Hubert's excellent suggestion for phase 1 of translation
I've put back some prose to describe multi-line comments
I made sure whitespace does not appear at the start of a sentence
I introduced the grammar term line-break-character to describe single-codepoint line breaks (\n, \r) independently of line-break sequences (like \r\n)
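
To illustrate the distinction between a line-break-character and a line-break sequence, here is a minimal sketch. It is not wording from the paper: the function names are hypothetical, and it assumes that only the two characters named above (\n and \r) count as line-break characters and that \r\n is the only multi-character sequence.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: true for a single code point that ends a line by itself.
// Assumption: only '\n' and '\r' are considered here.
constexpr bool is_line_break_character(char c) {
    return c == '\n' || c == '\r';
}

// Hypothetical helper: counts line breaks in `text`, treating the
// two-character sequence "\r\n" as ONE line break rather than two.
constexpr std::size_t count_line_breaks(std::string_view text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (!is_line_break_character(text[i])) {
            continue;
        }
        // A '\r' immediately followed by '\n' forms a single sequence:
        // skip the '\n' so it is not counted again.
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
            ++i;
        }
        ++count;
    }
    return count;
}
```

With this sketch, "a\r\nb" contains one line break, while "a\n\rb" contains two, because \n followed by \r is not treated as a sequence.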

Hopefully we can take this off the hands of SG16!

Thanks again for the feedback,

Corentin


On Fri, Sep 10, 2021 at 9:36 AM Jens Maurer <Jens.Maurer@gmx.net> wrote:
On 09/09/2021 22.54, Hubert Tong wrote:

>         (2)
>         In the new [lex.whitespaces] subclause, the following is added:
>         whitespaces are ignored except as they serve to separate tokens
>
>         This seems to have come from the text being removed out of [lex.token] (where it was excusable). Whitespace separation is significant in [cpp.replace.general], etc. This sentence should at best be a note in relation to phase 7 of translation.
>
>
>     I am happy to remove it entirely.
>     It's certainly not needed for phase 7. And I think phase 3 wording says something similar.
>
>     In [cpp.pre], Jens objected to my removal of
>
>     > The only whitespace characters that shall appear between preprocessing tokens within a preprocessing directive (from just after the directive-introducing token through just before the terminating new-line character) are space and horizontal-tab (including spaces that have replaced comments or possibly other whitespace characters in translation phase 3).
>
>     I am not able to convince myself that the grammar described at the start of [cpp.pre] allows line-breaks to appear between preprocessing tokens within a preprocessing directive,
>     but I'm happy to replace the struck paragraph with
>
>     Only /horizontal-whitespace/s shall appear between preprocessing tokens within a preprocessing directive.
>
>
> This is neither necessary nor harmless. The term "preprocessing directive" is defined right after the grammar in [cpp.pre]. Its definition precludes the presence of line-breaks outside of /**/ between preprocessing tokens within the directive. We do want to allow comments in preprocessing directives.

Good point; I was missing [cpp.pre] p1, which seems to say everything we want to say.
So, I'm good with the removal now.

> This seems to point to another problem with the wording:
> (3)
> The removal of the comment replacement in phase 3 means that the definition of preprocessing directives requires an update to qualify that it wants to talk about line-breaks (but not those that are inside /**/ comments).

Indeed, the status quo seems to allow new-lines within /* comments */ inside a preprocessing-directive.
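
For illustration only, a minimal sketch of the kind of code in question. The macro and function names are hypothetical; the assumption is that, under the current phase 3 wording, the comment is replaced by one space before the directive is processed in phase 4.

```cpp
// A /* */ comment containing a new-line, placed inside a #define directive.
// The directive still ends at the new-line that follows "42".
#define ANSWER /* this comment starts on the directive's line
   and continues on a second source line */ 42

// Hypothetical accessor, so the expansion can be checked at compile time.
constexpr int answer() { return ANSWER; }

static_assert(answer() == 42, "the comment does not terminate the directive");
```

If the new-line inside the comment terminated the directive, ANSWER would expand to nothing and the second source line would not parse.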

Maybe it would be clearer/easier to retain the replacement of comments with a single space
character in phase 3?  While we do need to differentiate different kinds of whitespace
(horizontal whitespace vs. new-lines) in phase 4, there's no point in talking about comments
separately beyond phase 3.

Jens