We can certainly reconsider this state of affairs
(in particular, we can make "new-line" a lexing
element that is some character sequence), which
would allow/require retaining the exact shape
of the character sequence for raw string literals,
but that's not what compilers currently do, I think.
(But maybe that's a bug.)
> In any case, I think we want to specify what a /whitespace/ is as a grammar element and replace all mentions of whitespace, whitespaces, and whitespace characters with /whitespace/.
Sounds reasonable.
> For simplicity, it's probably useful to define /horizontal-whitespace/ and /whitespace/, maybe in [lex.token]
>
> /horizontal-whitespace/
> /horizontal-whitespace/
> SPACE
> HORIZONTAL TAB
>
> /whitespace/
> /horizontal-whitespace/
> LINE FEED
>
> If we want to keep exact line terminators in phase 1, we can do the same for new-line (note, there is currently a grammar production for new-line in [cpp]: /new-line/: the new-line character)
>
> We could simplify further by adding comments to whitespaces, but there is no grammar for that :(
We could add some grammar.
I spent quite a bit of time on that.
After some reflection, I decided to keep line-break as a grammar element instead of referring to LINE FEED directly.
I decided to use the term line-break so that it doesn't collide with new-line in string literals.
While string-literals use LINE FEED for new-line, I think it's valid for that to be mapped to, for example, NEXT LINE in phase 5, so we probably want to keep the term new-line,
as it is later referred to in the library part (to mean whatever LINE FEED maps to, rather than specifically LINE FEED).
(Of course, it's an early draft, but I am hoping both SG16 and Core would like the direction.)
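
To make the horizontal-whitespace / whitespace / line-break split discussed above concrete, here is a rough C++ sketch of how a lexer might classify code points under the quoted grammar. All function names are hypothetical and not taken from any compiler or from the draft. It assumes phase 1 has already normalized line terminators to LINE FEED, and it deliberately follows the quoted productions only (SPACE, HORIZONTAL TAB, LINE FEED), so treat it as an illustration rather than proposed wording.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helpers mirroring the grammar elements under discussion.

// horizontal-whitespace: SPACE or HORIZONTAL TAB
constexpr bool is_horizontal_whitespace(char32_t c) {
    return c == U' ' || c == U'\t';
}

// line-break: assumed here to be LINE FEED after phase 1 normalization
constexpr bool is_line_break(char32_t c) {
    return c == U'\n';
}

// whitespace: horizontal-whitespace or line-break
constexpr bool is_whitespace(char32_t c) {
    return is_horizontal_whitespace(c) || is_line_break(c);
}

// Returns the length of the leading run of whitespace in s.
constexpr std::size_t skip_whitespace(std::u32string_view s) {
    std::size_t i = 0;
    while (i < s.size() && is_whitespace(s[i])) {
        ++i;
    }
    return i;
}
```

Keeping is_line_break as its own predicate matches the idea of a separate line-break grammar element: if phase 1 were changed to retain the exact line terminators, only that one function would need to change.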
Jens