On Wed, 17 Jun 2020 at 18:00, Steve Downey <sdowney@gmail.com> wrote:

Stringizing also reaches back and can distinguish how a token was spelled. http://eel.is/c++draft/cpp.stringize#2.sentence-6 I think in practice this only shows up for escape sequences, though? I'm not able to construct an example quickly.
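
A minimal sketch of the simpler, non-controversial cases where stringizing preserves spelling. The `STR` macro and the variable names are hypothetical, chosen for illustration only. It does not cover the universal-character-name case, which is where implementations may differ.

```cpp
#include <string_view>

// Hypothetical helper macro: stringizes its argument as spelled.
#define STR(x) #x

// Two tokens with the same value but different spellings stay distinct.
constexpr std::string_view hex_spelling = STR(0x10);
constexpr std::string_view dec_spelling = STR(16);

// Stringizing a string literal keeps the escape sequence as written:
// the result holds the four characters  "  \  n  "  rather than a newline.
constexpr std::string_view escaped = STR("\n");

static_assert(hex_spelling == "0x10");
static_assert(dec_spelling == "16");
static_assert(escaped == "\"\\n\"");
static_assert(escaped.size() == 4);
```

The static assertions hold on any conforming C++17 compiler, since this part of stringizing is fully specified.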

I think spelling refers to the sequence of abstract characters, rather than the encoded byte values.

(And I think Tom mentioned that in the hypothetical drawing-of-code scenario, you can't ever roll back to whatever the original thing was.)

In any case, I think all of these corner cases go away if we don't convert escape sequences in phase 1.

On Wed, Jun 17, 2020 at 10:35 AM Corentin Jabot <corentinjabot@gmail.com> wrote:

On Wed, 17 Jun 2020 at 01:20, Steve Downey via SG16 <sg16@lists.isocpp.org> wrote:
My priorities, possibly not exhaustive

1) be able to have sensible discussions about the encodings of literals in the standard

2) use the same terminology we would use in describing Library facilities today

3) fix the hand-waving in raw literals and any other place where, in practice, compilers access the logical source file.

I don't believe compilers do that in practice. The only observable difference between raw literals and non-raw literals in practice
is how universal-character-name *escape sequences* are handled (as well as line splicing); the encodings involved are / should be the same,
and the wording should do a better job of describing this intent, as too many destructive operations are done in phases 1 and 2.
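
The two observable differences mentioned above can be sketched as follows. The constant names are hypothetical; the `u8` prefix is used only so that the size of the encoded universal-character-name does not depend on the implementation's ordinary literal encoding.

```cpp
#include <cstddef>

// Ordinary literal: \u00e9 is an escape sequence for U+00E9, which is
// two UTF-8 code units, plus the terminating null.
constexpr std::size_t cooked_ucn_size = sizeof(u8"\u00e9");

// Raw literal: the six characters  \ u 0 0 e 9  are kept as written,
// plus the terminating null.
constexpr std::size_t raw_ucn_size = sizeof(u8R"(\u00e9)");

// Ordinary literal: backslash-newline is spliced away, leaving "abcd".
constexpr std::size_t cooked_splice_size = sizeof("ab\
cd");

// Raw literal: the backslash and the newline are both part of the string.
constexpr std::size_t raw_splice_size = sizeof(R"(ab\
cd)");

static_assert(cooked_ucn_size == 3);
static_assert(raw_ucn_size == 7);
static_assert(cooked_splice_size == 5);
static_assert(raw_splice_size == 7);
```

In both pairs the source characters are identical; only the handling of the escape sequence and the line splice differs.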

4) be able to avoid "standard" terms that aren't actually in the standard, such as 'execution encoding', by having actual terminology.

5) remove the conversion to universal-character-name while keeping that as an escape sequence. Use notional code points instead, which cleans up accidentally forming a ucn.
+1000

Strongly agree with all of that!

--
SG16 mailing list
SG16@lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/sg16