Re: [isocpp-sg16] Confirming my understanding of Unicode source files

From: Alisdair Meredith <alisdairm_at_[hidden]>
Date: Sun, 30 Jun 2024 14:00:21 -0400
Re splicing: there is some disagreement between current vendors,
but it seems to be supported:
   https://godbolt.org/z/G9TYohccb

That said, no-one is normalizing their Unicode, so the identifiers
appear to be distinct.
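
For anyone who wants to poke at this without a compiler front end, here is a
rough sketch of the two points above. It assumes that line splicing (phase 2)
runs before preprocessing tokens are formed (phase 3), and it compares
spellings as raw UTF-8 bytes with no normalization. The names `splice_lines`
and `check_examples` are made up for illustration and are not taken from any
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from any compiler): models translation phase 2
// by deleting every backslash that is immediately followed by a newline.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += src[i];
    }
    return out;
}

// Checks the two cases discussed in this thread.
void check_examples() {
    // Plain ASCII: the splice joins "fo" and "o" before tokens are formed.
    assert(splice_lines("fo\\\no =") == "foo =");

    // U+0308 COMBINING DIAERESIS is the UTF-8 byte pair CC 88.
    const std::string spliced = splice_lines("foo\\\n\xCC\x88 =");
    assert(spliced == "foo\xCC\x88 =");

    // Precomposed U+00F6 is C3 B6: without normalization the spellings differ.
    assert(spliced != "fo\xC3\xB6 =");
}
```

Calling `check_examples()` from a test driver exercises both cases. The last
assertion is the "distinct identifiers" observation: the decomposed and
precomposed spellings only compare equal if something normalizes them first.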

AlisdairM

> On 30 Jun 2024, at 11:52, Steve Downey <sdowney_at_[hidden]> wrote:
>
> The translation from a source file, unlike in C, is implementation defined and happening in phase 0. While we did manage to say you are supposed to accept a UTF-8 file, I don't think we said you must reject an ill-formed one.
> That said, splicing, like other preprocessor things, can't form new tokens out of the tokens it's working with, so I'm pretty sure that's ill-formed outside a string literal. Inside a string literal, an initial combining sequence would combine with the trailing bit of whatever it's pasted together with, but there would also need to be intervening quotes.
>
> foo\
> \u{0308} =
>
> Is not foö =
>
> For the same reasons that
> fo\
> o =
>
> Is not foo =
>
> And hopefully my phone has not "helped" me by breaking that all somehow.
>
>
> On Sun, Jun 30, 2024, 11:37 Alisdair Meredith via SG16 <sg16_at_[hidden]> wrote:
> If I have an implementation that accepts only valid UTF-8 encoded
> source files, is the following correct:
>
> If I have a line-splice in the middle of the encoding of a UTF-8 code
> point, then I have a badly encoded source file that should be rejected.
>
> If I have a line-splice in the middle of an identifier separating a combining
> character (such as an accent) from the character it combines with, that
> should be valid as the combining character is expected to modify the next
> element in the source file, not the line-splice character, as the line-splice
> token is a direction to the translator on how to proceed to that next element.
>
> AlisdairM
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

Received on 2024-06-30 18:00:36