sg16: Re: [SG16] Whitespaces again

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Wed, 22 Sep 2021 00:53:42 -0400

In the grammar for *single-line-comment*, do not require comments to be
non-empty:
// *single-line-comment-elem*
=>
//

In the grammar for *single-line-comment-elem*, fix the category mismatch:
except *line-break*
=>
except a character that begins (and is the whole or a part of) a
*line-break*

or "except a character that matches *line-break-character*"
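
To make the two comment-grammar points above concrete, here is a minimal sketch of a scanner that accepts an empty comment and stops at the first character that begins a line-break. The names `is_line_break_character` and `match_single_line_comment` are hypothetical, and treating only CR and LF as line-break characters is an assumption for illustration; the set in the paper may differ.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Assumption for illustration only: CR and LF are the line-break characters.
constexpr bool is_line_break_character(char c) {
    return c == '\n' || c == '\r';
}

// Hypothetical helper: if `src` starts with "//", returns the length of the
// single-line comment (including the "//"); otherwise returns nullopt.
// The comment body may be empty, and it ends before the first character
// that begins a line-break.
std::optional<std::size_t> match_single_line_comment(std::string_view src) {
    if (src.substr(0, 2) != "//") return std::nullopt;
    std::size_t len = 2;
    while (len < src.size() && !is_line_break_character(src[len])) ++len;
    return len;
}
```

Testing each character against the line-break characters, rather than against *line-break* as a whole, is what avoids the category mismatch: a CR ends the comment whether or not an LF follows it.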

In the grammar for *multi-line-comment*:
Apply "opt" to *multi-line-comment-elem-seq*.

In [lex.whitespaces]:
A disambiguation rule is required to prefer matching CRLF as a *line-break*
instead of two *line-break*s.
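
As a sketch of what the disambiguation rule amounts to, the following counts line-breaks with a longest-match preference for CRLF. The function name `count_line_breaks` is hypothetical, and the assumption that a *line-break* is LF, CR LF, or a lone CR is for illustration only.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts line-breaks in `text`, assuming (for
// illustration) that a line-break is LF, CR LF, or a lone CR.
std::size_t count_line_breaks(std::string_view text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            // Disambiguation: a CR immediately followed by LF is consumed
            // as a single line-break, not as two.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
            ++count;
        } else if (text[i] == '\n') {
            ++count;
        }
    }
    return count;
}
```

Without the lookahead on CR, the input "a\r\nb" would count as two line-breaks, which is exactly the reading the rule needs to exclude.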

In the grammars for *h-char* and *q-char*, use whatever formulation was
chosen for *single-line-comment-elem*.

Same comment for the grammars of *basic-c-char* and *basic-s-char*.

Same comment for the grammar of *d-char*. Additionally, use "a character
that matches *horizontal-whitespace-character*" instead of the implied "a
character that is *horizontal-whitespace-character*".

In [lex.string], the "*line-break*" in a raw string literal wording could
be more explicit about scanning for line-breaks (a sequence matching a
*line-break* is not a *line-break* "for free"; it is a *line-break* only if,
for example, the grammar asks for a *line-break*).
This can be done by adding *line-break* under the *r-char* grammar and
adjusting the other *r-char* case with the formula from
*single-line-comment-elem*.
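
A rough sketch of the *r-char* split suggested above: the scanner explicitly asks for a *line-break* first, and otherwise takes a single character. The function name `raw_body_to_value` is hypothetical. Both the set of line-breaks (LF, CR LF, lone CR) and the mapping of each line-break to one new-line in the resulting string are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: walks the body of a raw string literal as a sequence
// of r-chars, where an r-char is either a line-break or any other single
// character. Assumes (for illustration) that a line-break is LF, CR LF, or
// a lone CR, and that each line-break yields one '\n' in the result.
std::string raw_body_to_value(std::string_view body) {
    std::string out;
    out.reserve(body.size());
    for (std::size_t i = 0; i < body.size(); ++i) {
        const char c = body[i];
        if (c == '\r') {
            // r-char: line-break (CR LF is matched as one element).
            if (i + 1 < body.size() && body[i + 1] == '\n') ++i;
            out.push_back('\n');
        } else {
            // LF is a line-break that already is a new-line; every other
            // r-char is copied unchanged.
            out.push_back(c);
        }
    }
    return out;
}
```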

In [cpp.line]:
number of *line-break*
=>
number of *line-break*s

In [lex.pptoken]:
The instances of "non-whitespace character" with respect to the "cannot be
one of the above" case are problematic if the interpretation leaves us with
cases where there are Unicode whitespace characters that are a part of
neither a preprocessing token nor a *whitespace*. That's a new situation,
which the surrounding wording cannot be relied upon to handle in a
straightforward manner.

This could be fixed by replacing:
each non-whitespace character that cannot be one of the above
=>
each character that cannot be considered part of a *whitespace* and cannot
be one of the above

This also happens to fix a pre-existing issue: the wording is rather weak
on preferring to interpret comments as comments.

On Tue, Sep 21, 2021 at 1:15 AM Corentin via SG16 <sg16_at_[hidden]>
wrote:

> Dear vertical tabs aficionados,
> Here is the last version of the whitespace paper
> https://isocpp.org/files/papers/P2348R1.pdf
>
> I hope we can move the paper forward tomorrow in as little time as
> possible.
>
> I am afraid that I will have to go through the whole process, which is
> quite unfortunate, but will give more time for people to comment on the
> wording.
>
> I am sure there are still minor issues with it and I truly appreciate the
> feedback but I hope additional comments can be handled before and after
> sg16, which has more important items on the agenda.
>
> Thanks,
>
> Corentin

Received on 2021-09-21 23:54:12