Re: [SG16] Revised and repaired P1949R2 - C++ Identifier Syntax using Unicode Standard Annex 31

From: JF Bastien <cxx_at_[hidden]>
Date: Mon, 24 Feb 2020 10:11:01 -0800
On Mon, Feb 24, 2020 at 10:03 AM Tom Honermann <tom_at_[hidden]> wrote:

> On 2/24/20 12:50 PM, JF Bastien wrote:
>
>
>
> On Mon, Feb 24, 2020 at 8:08 AM Hubert Tong via SG16 <
> sg16_at_[hidden]> wrote:
>
>> - Was EWG informed that the proposal can be understood as introducing new
>>> core language undefined behaviour for existing programs with identifiers in
>>> NFC form where the concatenation is not in NFC form ([cpp.concat])? I note
>>> that R1 of the paper was not clear on that point and R2 does not identify
>>> it as a consideration.
>>>
>>> I don't recall discussing this in EWG, but I believe it was discussed at
>>> some point and the conclusion was that an identifier not in NFC form
>>> resulting from such concatenation was ill-formed. Can you elaborate as to
>>> how the UB manifests?
>>>
>> The typical manifestation of the UB in cpp.concat that does not produce a
>> diagnostic for the case in question is that the concatenation may instead
>> leave the operand tokens behind. That is, the ill-formed result is not
>> required to manifest.
>>
>
> Per our new process with SG12, we need to document and add rationale to
> any new UB. Let's make sure that's in the proposal.
>
> I'm not sure that there is new UB here. If I understand Hubert's point
> correctly, it is that, if the use of the preprocessing ## operator does not
> produce a valid preprocessing token, that the behavior is undefined. In
> the case under discussion, the question is then, is a token that is not
> spelled in NFC form (potentially produced by pasting two NFC tokens
> together) a valid preprocessing token?
>
> The paper discusses identifiers. Do we need to address tokens as well?
>
I was just sending a reminder that *if* we add UB then we need to follow
the new process. We could also *not* add UB :-)
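
For concreteness, the case under discussion (pasting two NFC tokens into a spelling that is not NFC) can be sketched with Hangul jamo: U+1100 and U+1161 are each in NFC on their own, but the sequence U+1100 U+1161 normalizes to the single syllable U+AC00. The sketch below is only an illustration, not part of P1949: the helper names are hypothetical, and it covers only the algorithmic Hangul L+V composition from the Unicode Standard, not a full NFC check (it ignores LV+T and all non-Hangul compositions).

```cpp
#include <cassert>
#include <string_view>

// The preprocessor case being modelled (shown as a comment, since whether
// it is accepted depends on the implementation):
//   #define CAT(a, b) a##b
//   CAT(<U+1100>, <U+1161>)   // each operand is NFC; the pasted spelling is not

// Hypothetical helper: true if `lead` followed by `vowel` is a Hangul
// leading consonant + vowel pair that NFC composes into one syllable.
constexpr bool composes_as_hangul_lv(char32_t lead, char32_t vowel) {
    return lead >= 0x1100 && lead <= 0x1112 &&   // leading consonants (L)
           vowel >= 0x1161 && vowel <= 0x1175;   // vowels (V)
}

// Hypothetical helper: the precomposed syllable for an L+V pair,
// using the arithmetic Hangul composition (21 vowels, 28 trailing slots).
constexpr char32_t compose_hangul_lv(char32_t lead, char32_t vowel) {
    return 0xAC00 + ((lead - 0x1100) * 21 + (vowel - 0x1161)) * 28;
}

// Hypothetical helper: true if concatenating two spellings creates an
// L+V pair at the seam, so the result is not in NFC even if both inputs are.
constexpr bool pasting_breaks_nfc(std::u32string_view left,
                                  std::u32string_view right) {
    return !left.empty() && !right.empty() &&
           composes_as_hangul_lv(left.back(), right.front());
}

// Compile-time checks of the example above.
static_assert(pasting_breaks_nfc(U"\u1100", U"\u1161"));
static_assert(compose_hangul_lv(0x1100, 0x1161) == 0xAC00);
static_assert(!pasting_breaks_nfc(U"foo", U"bar"));
```

Whether such a pasted spelling counts as a valid preprocessing token is exactly the open question above; the sketch only shows that the seam can break NFC.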

Received on 2020-02-24 12:13:54