C++ Logo


Advanced search

Re: [SG16] P2348: Feedback on r1 draft

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Tue, 14 Sep 2021 20:13:09 -0400
Corentin, thank you. With the further clarity re: what *whitespace* is
meant to represent, I was able to do a more complete review.

The removal of the first sentence in [lex.token] seems like a drive-by
change that I am not quite understanding the motivation for.

The comment grammar presentation is still, under the existing conventions
of reading, describing only comments with a "payload" of a single
character. Please refer to how *r-char-sequence* is defined for inspiration.

The addition of "*whitespace*s are ignored except as they serve to separate
tokens" in [lex.whitespaces] that was to be removed is still there (and the
sentence still begins with the lowercase grammar term).

The "*whitespace* containing" issue in [cpp.pre] still occurs in another
instance occurring after "followed by" (as opposed to the fixed instance
that occurred after "follows a"):
followed by *whitespace* containing a *line-break*
followed by *whitespace*s containing a *line-break*

The "[only] *horizontal-whitespace*s" sentence in [cpp.pre] should be
removed or made a note (replacing "shall" with "can"). Also, as noted in
the SG 16 meeting, the removal of the existing diagnosable rule *does*
require changes to GCC's -pedantic-errors implementation: The "Do not
perform whitespace replacement" part of the design section of the paper
makes a statement that appears to be contrary to this fact.

For [cpp.replace.general]:
same number, ordering, spelling, and *whitespace* separation, where all
*whitespace* separations are considered identical
same number, ordering, spelling, and separation with *whitespace*s, where
all sequences of one or more *whitespace*s are considered identical

There shall be *whitespace*
There shall be one or more *whitespace*s

Any *whitespace* preceding
Any *whitespace*s preceding

Important (!), in [cpp.stringize]:
Each occurence of *whitespace*
Each sequence of one or more *whitespace*s

In [cpp.line]:
number of *line-break* read or introduced in translation phase 1
number of *line-break*s resulting from translation phase 1

On Mon, Sep 13, 2021 at 6:02 AM Corentin <corentin.jabot_at_[hidden]> wrote:

> Here is a draft of changes as requested by SG16, Jens, and Hubert
> https://isocpp.org/files/papers/D2348R1.pdf
> I found it better for /whitespace/ to refer to a single whitespace instead
> of describing a sequence. I have adjusted the pluralisation of everything
> accordingly.
> I've added some notes to clarify the intent in important places
> I've used Hubert's excellent suggestion for phase 1 of translation
> I've put back some prose to describe multi-line comments
> I made sure whitespace does not appear at the start of a sentence
> I introduced the grammar term line-break-character to describe
> single-codepoint line-breaks (\n, \r) independently of line-breaks
> sequences (like \r\n)
> Hopefully we can take this of the hands of SG16!
> Thanks again for the feedback,
> Corentin
> On Fri, Sep 10, 2021 at 9:36 AM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>> On 09/09/2021 22.54, Hubert Tong wrote:
>> > (2)
>> > In the new [lex.whitespaces] subclause, the following is added:
>> > whitespaces are ignored except as they serve to separate tokens
>> >
>> > This seems to have come from the text being removed out of
>> [lex.token] (where it was excusable). Whitespace separation is significant
>> in [cpp.replace.general], etc. This sentence should at best be a note in
>> relation to phase 7 of translation.
>> >
>> >
>> > I am happy to remove it entirely.
>> > It's certainly not needed for phase 7. And I think phase 3
>> wording says something similar.
>> >
>> > In [cpp.pre], Jens objected to my removal of
>> >
>> > > The only whitespace characters that shall appear between
>> preprocessing tokens within a preprocessing directive (from just after the
>> directive-introducing token through just before the terminating new-line
>> character) are space and horizontal-tab (including spaces that have
>> replaced comments or possibly other whitespace characters in translation
>> phase 3).
>> >
>> > I am not able to convince myself than the grammar described at the
>> start at [cpp.pre] allows line-breaks to appear between preprocessing
>> tokens within a preprocessing directive,
>> > but I'm happy to replaced the striked paragraph by
>> >
>> > Only /horizontal-whitespace/s shall appear between preprocessing
>> tokens within a preprocessing directive.
>> >
>> >
>> > This is neither necessary nor harmless. The term "preprocessing
>> directive" is defined right after the grammar in [cpp.pre]. Its definition
>> precludes the presence of line-breaks outside of /**/ between preprocessing
>> tokens within the directive. We do want to allow comments in preprocessing
>> directives.
>> Good point; I was missing [cpp.pre] p1, which seems to say everything we
>> want to say.
>> So, I'm good with the removal now.
>> > This seems to point to another problem with the wording:
>> > (3)
>> > The removal of the comment replacement in phase 3 means that the
>> definition of preprocessing directives requires an update to qualify that
>> it wants to talk about line-breaks (but not those that are inside /**/
>> comments).
>> Indeed, the status quo seems to allow new-lines within /* comments */
>> inside a preprocessing-directive.
>> Maybe it would be clearer/easier to retain the replacement of comments
>> with a single space
>> character in phase 3? While we do need to differentiate different kinds
>> of whitespace
>> (horizontal whitespace vs. new-lines) in phase 4, there's no point in
>> talking about comments
>> separately beyond phase 3.
>> Jens

Received on 2021-09-14 19:13:39