On Mon, Feb 24, 2020 at 8:08 AM Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:
I don't recall discussing this in EWG, but I believe it was discussed at some point and the conclusion was that an identifier not in NFC form resulting from such concatenation was ill-formed. Can you elaborate as to how the UB manifests?

- Was EWG informed that the proposal can be understood as introducing new core language undefined behaviour for existing programs with identifiers in NFC form where the concatenation is not in NFC form ([cpp.concat])? I note that R1 of the paper was not clear on that point and R2 does not identify it as a consideration.
When no diagnostic is produced for the case in question, the typical manifestation of the UB in [cpp.concat] is that the concatenation leaves the operand tokens behind instead. That is, the ill-formed result is not required to manifest.
Per our new process with SG12, we need to document and add rationale to any new UB. Let's make sure that's in the proposal.
I'm not sure that there is new UB here. If I understand Hubert's point correctly, it is that the behavior is undefined if the use of the preprocessing ## operator does not produce a valid preprocessing token. The question in the case under discussion is then: is a token that is not spelled in NFC form (potentially produced by pasting two NFC tokens together) a valid preprocessing token?
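To make the "two NFC tokens pasted together" case concrete, here is a minimal sketch in Python (chosen only because its standard library exposes Unicode normalization). It is not taken from the paper. The helper is_nfc and the choice of Hangul jamo are assumptions made for illustration: U+1100 and U+1161 are each letters and each in NFC on their own, but their concatenation composes to U+AC00 under NFC, so the pasted spelling is not in NFC.

```python
import unicodedata


def is_nfc(s: str) -> bool:
    """Hypothetical helper: True if s is already in Normalization Form C."""
    return unicodedata.normalize("NFC", s) == s


# Two one-character spellings, standing in for the operands of ##.
left = "\u1100"   # HANGUL CHOSEONG KIYEOK (leading consonant jamo)
right = "\u1161"  # HANGUL JUNGSEONG A (vowel jamo)

# Plain string concatenation models what token pasting does to the spellings.
pasted = left + right

print(is_nfc(left), is_nfc(right), is_nfc(pasted))        # True True False
print(unicodedata.normalize("NFC", pasted) == "\uac00")   # True
```

This only shows that NFC is not closed under concatenation. Whether the resulting spelling counts as a valid preprocessing token is the open question above.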
The paper discusses identifiers. Do we need to address tokens as well?
Tom.