liaison: Re: [wg14/wg21 liaison] adding punctuator tokens

From: Tom Honermann <tom_at_[hidden]>
Date: Mon, 19 Apr 2021 00:51:26 -0400

On 4/16/21 2:55 AM, Jens Gustedt via Liaison wrote:
> Steve,
>
> on Thu, 15 Apr 2021 14:16:20 -0400 you (Steve Downey via Liaison
> <liaison_at_[hidden]>) wrote:
>
>> Middle dot (U+00B7) is currently an identifier, and remains so in C++
>> with P1947, so it shouldn't be an issue. Proportion (U+2237) is part
>> of Mathematical Symbols rather than punctuation, and isn't in the
>> identifier list, so isn't going to conflict.
>> However, giving alternate spellings for code is inevitably going to
>> cause confusion. I don't believe we really need more ways of spelling
>> the same thing. Digraphs solve a narrow technical problem, but the
>> most common use is creating confusing compiler errors.
> Digraphs are clearly meant to help people/platforms in a transition
> phase, not as an eternal status quo. In that sense I think the
> existing ones worked reasonably well.
>
> For the new ones my long term expectation would be the same. Long term
> for C++ probably here means a different thing than for C, because C++
> already forces better Unicode support. In C we have platforms that are
> still using trigraphs in essential parts of their software
> architecture, so this would probably take much longer.

Digraphs (and other alternative tokens) are present to allow the use of
character sets that lack representation for some members of the basic
source character set, specifically those characters that are not
members of the invariant subset of EBCDIC. I don't think these were
intended as a transition aid, given that there seems to have been no
intent to further constrain acceptable character sets at a later date.
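
To make the alternative spellings concrete, here is a minimal sketch of
the same two functions written with the conventional punctuators and
with digraphs and alternative tokens. The function names are
hypothetical and exist only for this illustration; nothing here is
taken from a paper or from the standard's text.

```cpp
#include <cassert>

// Conventional spelling: relies on [ ] { } | ^ ~ & !, characters that
// are not available at stable code points in every character set.
int mix_plain(const int* v) {
    return (v[0] | v[1]) ^ ~v[2];
}

bool in_range_plain(int x, int lo, int hi) {
    return x >= lo && !(x >= hi);
}

// Same functions using digraphs (<: :> for [ ], <% %> for { }) and
// alternative tokens (bitor, xor, compl, and, not).
int mix_alt(const int* v) <%
    return (v<:0:> bitor v<:1:>) xor compl v<:2:>;
%>

bool in_range_alt(int x, int lo, int hi) <%
    return x >= lo and not (x >= hi);
%>

// Hypothetical self-check: both spellings should behave identically,
// since the alternative forms are just other spellings of the same tokens.
void check_spellings() <%
    const int sample<:3:> = <%1, 2, 4%>;
    assert(mix_alt(sample) == mix_plain(sample));
    assert(in_range_alt(3, 0, 5) == in_range_plain(3, 0, 5));
%>
```

Both versions should tokenize to the same program in a conforming
implementation, which is why these spellings help only with
representing the source text and do nothing for readability.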

Although C++ no longer formally recognizes trigraphs, nothing prohibits
their use in C++. The status quo is that trigraphs are still
permitted under the implementation-defined translation phase 1 character
mappings. Thus, while programmers can no longer rely on them being
recognized in portable code, implementations can continue to support
them and remain conforming.

I'm not aware of anything in C++ that forces better Unicode support.
WG21's SG16 is working to improve support for Unicode, but not with the
intent to exclude support for legacy character sets.

Tom.

Received on 2021-04-18 23:51:31