Re: [wg14/wg21 liaison] (SC22WG14.19267) adding punctuator tokens

From: Joseph Myers <joseph_at_[hidden]>
Date: Thu, 15 Apr 2021 17:32:16 +0000
In existing versions of C++, translation phase 1 converts every character
not in the basic source character set to a universal character name (UCN).
Outside of strings, such UCNs then match the lexical syntax production for
an identifier, but are outside of the ranges of characters permitted in
identifiers. This means the use of such a character yields an invalid
identifier and is generally invalid *even inside #if 0*, much as
unmatched ' or " characters are invalid even inside #if 0.
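
To illustrate the distinction, here is a minimal sketch (the function names are
made up for the example). The skipped group holds text that is nonsense as C++
but still splits into valid pp-tokens, so it compiles. The invalid cases
described above are mentioned only in a comment, since including them would
make the example itself invalid.

```cpp
#include <cassert>

// Hypothetical function, present only so the example has something to run.
int visible_value() { return 1; }

#if 0
// This group is skipped, so it is never parsed as C++, but it is still
// split into pp-tokens. The next line is not a declaration or expression,
// yet every piece is a valid pp-token built from basic source characters:
// punctuators, identifiers, and the pp-number 12abc.
)) +* 12abc not a declaration ((
// A lone single-quote or double-quote character on a line here, or (per
// the message) a character outside the ranges permitted in identifiers,
// would not form a valid pp-token, and skipping would not rescue it.
#endif

// Small runtime check so the sketch can be exercised.
void check_skipped_group() { assert(visible_value() == 1); }
```

Comments are removed before the skipped text is tokenized, which is why the
explanatory comments inside the #if 0 group are harmless.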

The matter of being invalid inside #if 0 is an important one. With new
language features, normally it's possible to write code with #if
conditionals on the value of __STDC_VERSION__ or __cplusplus, that only
uses the new feature if the language version is new enough. When a new
feature involves text that is invalid inside #if 0, that doesn't work.
So you can't generally use such characters (in C++), or corresponding UCNs
(in both C and C++), in such conditional code, because that usage is
invalid in #if 0 for existing language versions; you'd have to put the
new-language-version code in an entirely separate header, that's included
by a #include that itself is conditional, so compilers for old language
versions don't see the new-language-version code at all.
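
A sketch of the two patterns, under stated assumptions: the version value
299900L and the header name "new_syntax_only.hpp" are hypothetical
placeholders, not anything from a real standard or library. The first
conditional is the normal case, where both branches lex correctly everywhere.
The second is the separate-header workaround: for an old language version the
#include directive sits in a skipped group, so the header is never opened and
its contents are never tokenized.

```cpp
#include <cassert>

// Normal case: the new-version branch uses only tokens that every older
// preprocessor can lex, so a single file serves both language versions.
#if __cplusplus >= 201103L
constexpr int kAnswer = 42;
#else
static const int kAnswer = 42;
#endif

// Workaround: text that an old compiler cannot even tokenize lives in a
// separate file. For old versions this directive is skipped, so the file
// is never read. Both the version value and the header name are
// hypothetical placeholders chosen for this sketch.
#if __cplusplus >= 299900L
#include "new_syntax_only.hpp"
#endif

int answer() { return kAnswer; }

// Small runtime check so the sketch can be exercised.
void check_answer() { assert(answer() == 42); }
```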

Punctuator pp-tokens that are safe to add because they don't introduce
this issue (although they could still have compatibility issues if they
affect the interpretation of existing valid code) involve only characters
in the basic source character set other than ' and ".
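
As an illustration (this example is mine, not one raised in the thread), the
C++20 punctuator <=> is built only from basic source characters. A pre-C++20
compiler lexes it inside a skipped group as the two existing tokens <= and >,
so it can be used directly under a version conditional. The struct and
function names below are made up for the sketch.

```cpp
#include <cassert>
#if __cplusplus >= 202002L
#include <compare>
#endif

// Hypothetical type used only for this sketch.
struct Version {
    int major_num;
    int minor_num;

#if __cplusplus >= 202002L
    // New punctuator: an older compiler never parses this line, and when
    // skipping it sees only the valid pp-tokens "<=" and ">".
    auto operator<=>(const Version&) const = default;
#else
    // Fallback for older language versions: lexicographic less-than.
    friend bool operator<(const Version& a, const Version& b) {
        if (a.major_num != b.major_num) return a.major_num < b.major_num;
        return a.minor_num < b.minor_num;
    }
#endif
};

bool is_older(const Version& a, const Version& b) { return a < b; }

// Small runtime check so the sketch can be exercised.
void check_versions() {
    assert(is_older(Version{1, 2}, Version{1, 3}));
    assert(!is_older(Version{2, 0}, Version{1, 9}));
}
```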

Joseph S. Myers

Received on 2021-04-15 12:32:24