Re: [wg14/wg21 liaison] (SC22WG14.19283) adding punctuator tokens

From: Jens Gustedt <jens.gustedt_at_[hidden]>
Date: Fri, 16 Apr 2021 14:40:55 +0200

Joseph,

on Thu, 15 Apr 2021 17:32:16 +0000 you (Joseph Myers
<joseph_at_[hidden]>) wrote:

> In existing versions of C++, translation phase 1 converts characters
> not in the basic source character set to universal character names.
> So any such character gets converted to a universal character name.
> Outside of strings, such UCNs then match the lexical syntax
> production for an identifier, but are outside of the ranges of
> characters permitted in identifiers. This means the use of such
> characters yields an invalid identifier and is generally invalid
> *even inside #if 0*, much like e.g. unmatched ' or " characters are
> invalid even inside #if 0.
>
> The matter of being invalid inside #if 0 is an important one. With
> new language features, normally it's possible to write code with #if
> conditionals on the value of __STDC_VERSION__ or __cplusplus, that
> only uses the new feature if the language version is new enough.
> When a new feature involves text that is invalid inside #if 0, that
> doesn't work. So you can't generally use such characters (in C++), or
> corresponding UCNs (in both C and C++), in such conditional code,
> because that usage is invalid in #if 0 for existing language
> versions; you'd have to put the new-language-version code in an
> entirely separate header, that's included by a #include that itself
> is conditional, so compilers for old language versions don't see the
> new-language-version code at all.

This seems to be an inconsistency between C and C++. If I understand
this correctly, in C, extended source characters that the implementation
accepts in identifiers become part of identifier tokens. Other
extended source characters would remain as

   "single non-white-space characters that do not lexically match the
   other preprocessing token categories."

So in C, such characters should survive lexing as single preprocessing
tokens and can effectively be excluded by conditional preprocessing.
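
A minimal sketch of the C reading above, assuming a compiler that
accepts UTF-8 source text. The function name active_branch is
hypothetical and only serves the illustration. The stray characters sit
in a skipped group, so under this reading they are lexed as single
preprocessing tokens and then discarded. As described in the quoted
text, a C++ implementation that converts them to universal character
names in phase 1 could instead reject the same file.

```c
/* Hypothetical helper: reports which conditional group was kept. */
int active_branch(void)
{
#if 0
    /* Skipped group: none of the characters on the next line is in
       the basic source character set of C17. Each one is assumed to
       lex as a single "other" preprocessing token that is discarded
       with the rest of the group. */
    @ ` → ∀
    return 0;
#else
    /* Only this group reaches translation phase 7. */
    return 1;
#endif
}
```

Calling active_branch() from main is expected to return 1, since the
skipped group never reaches the parser.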

Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

Received on 2021-04-16 07:41:02