Re: [SG16] Revised and repaired P1949R2 - C++ Identifier Syntax using Unicode Standard Annex 31

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 24 Feb 2020 13:03:20 -0500
On 2/24/20 12:50 PM, JF Bastien wrote:
>
>
> On Mon, Feb 24, 2020 at 8:08 AM Hubert Tong via SG16
> <sg16_at_[hidden] <mailto:sg16_at_[hidden]>> wrote:
>
>> - Was EWG informed that the proposal can be understood as
>> introducing new core language undefined behaviour for
>> existing programs with identifiers in NFC form where the
>> concatenation is not in NFC form ([cpp.concat])? I note that
>> R1 of the paper was not clear on that point and R2 does not
>> identify it as a consideration.
> I don't recall discussing this in EWG, but I believe it was
> discussed at some point and the conclusion was that an
> identifier not in NFC form resulting from such concatenation
> was ill-formed. Can you elaborate as to how the UB manifests?
>
> The typical manifestation of the UB in cpp.concat that does not
> produce a diagnostic for the case in question is that the
> concatenation may instead leave the operand tokens behind. That
> is, the ill-formed result is not required to manifest.
>
>
> Per our new process with SG12, we need to document and add rationale
> to any new UB. Let's make sure that's in the proposal.

I'm not sure that there is new UB here. If I understand Hubert's point
correctly, it is that the behavior is undefined if use of the
preprocessing ## operator does not produce a valid preprocessing token.
The question in the case under discussion is then: is a token that is
not spelled in NFC form (potentially produced by pasting two NFC tokens
together) a valid preprocessing token?
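
To make the case concrete, here is a minimal sketch. The macro name CAT,
the identifier foobar, and the helper compose_lv are assumptions made up
for illustration; they do not come from the paper. The Hangul jamo pair
U+1100, U+1161 is used because each code point is in NFC on its own,
while the two-code-point sequence normalizes to the single syllable
U+AC00 under the Hangul composition arithmetic in the Unicode Standard.
The paste itself is left in a comment, since whether it is accepted,
diagnosed, or undefined is exactly the open question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical token-pasting helper, for illustration only.
#define CAT(a, b) a##b

// Uncontroversial case: pasting two ordinary identifiers forms `foobar`.
constexpr int CAT(foo, bar) = 42;

// The case under discussion (deliberately not compiled here):
//   int CAT(\u1100, \u1161) = 0;
// Each operand is a one-code-point identifier in NFC, but the pasted
// spelling <U+1100, U+1161> is not in NFC.

// Hangul composition constants from the Unicode Standard.
constexpr char32_t kLBase = 0x1100, kVBase = 0x1161, kSBase = 0xAC00;
constexpr int kLCount = 19, kVCount = 21, kTCount = 28;

constexpr bool is_leading_jamo(char32_t c) {
  return c >= kLBase && c < kLBase + kLCount;
}

constexpr bool is_vowel_jamo(char32_t c) {
  return c >= kVBase && c < kVBase + kVCount;
}

// Returns the precomposed LV syllable for a leading-consonant/vowel jamo
// pair, or 0 if the pair does not compose. A nonzero result means the
// two-code-point spelling is not in NFC.
constexpr char32_t compose_lv(char32_t l, char32_t v) {
  return (is_leading_jamo(l) && is_vowel_jamo(v))
             ? kSBase + ((l - kLBase) * kVCount + (v - kVBase)) * kTCount
             : 0;
}

static_assert(compose_lv(0x1100, 0x1161) == 0xAC00,
              "U+1100 U+1161 composes to U+AC00 under NFC");
```

This only checks the leading-consonant/vowel case of Hangul composition.
It is not a general NFC test, and it says nothing about what any
particular implementation does with the pasted token.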

The paper discusses identifiers. Do we need to address tokens as well?

Tom.


Received on 2020-02-24 12:06:03