sg16: Re: [SG16] [wg14/wg21 liaison] Characters literals in preprocessor conditionals

From: Tom Honermann <tom_at_[hidden]>
Date: Sun, 14 Jun 2020 13:02:05 -0400

On 6/14/20 6:59 AM, Florian Weimer via Liaison wrote:
> * Tom Honermann via Liaison:
>
>> Copying the WG21/WG14 liaison list. For context, refer to C++
>> [cpp.cond]p12 <http://eel.is/c++draft/cpp.cond#12> and the value of
>> character literals in conditional preprocessing directives. Quote:
>>
>>> Whether the numeric value for these character-literals matches the
>>> value obtained when an identical character-literal occurs in an
>>> expression (other than within a #if or #elif directive) is
>>> implementation-defined.
>>> [ Note: Thus, the constant expression in the following #if directive
>>> and if statement ([stmt.if]) is not guaranteed to evaluate to the same
>>> value in these two contexts:
>>>
>>> #if 'z' - 'a' == 25
>>> if ('z' - 'a' == 25)
>>>
>>> — end note
>>> ]
> The canonical example is probably signed vs unsigned chars. The rule
> in the standard allows preprocessing to be independent of that.
>
> On the other hand, one cannot use a simple preprocessor conditional
> (such as '\377' < 0) to tell whether chars are signed or not.

Corentin mentioned, outside this thread, what I think is likely the
primary historical reason that character literals can be observed with
different values at translation phases 4 and 7: a preprocessor (or any
tool that only operates through translation phase 4) may have no need
for the concept of an execution character set. Such tools may therefore
evaluate character literal values (in preprocessor conditionals) using
the numeric representation from the original source encoding.
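
A minimal sketch of how one might observe the difference, using the
'\377' < 0 test quoted above. The names pp_sees_negative,
expr_sees_negative, char_is_signed and phases_agree are hypothetical,
chosen only for illustration, and the sketch assumes 8-bit chars. It
records what phase 4 and phase 7 each see; it does not claim that any
particular implementation makes them differ.

```cpp
#include <climits>

// Result of the comparison as evaluated by the preprocessor
// (translation phase 4, inside a #if directive).
#if '\377' < 0
constexpr bool pp_sees_negative = true;
#else
constexpr bool pp_sees_negative = false;
#endif

// Result of the identical comparison as an ordinary constant
// expression (translation phase 7).
constexpr bool expr_sees_negative = ('\377' < 0);

// Whether plain char is signed, taken from <climits> rather than
// from a character literal.
constexpr bool char_is_signed = (CHAR_MIN < 0);

// True when both phases gave the same answer. Per [cpp.cond]p12,
// whether they match is implementation-defined.
constexpr bool phases_agree = (pp_sees_negative == expr_sees_negative);
```

A build that needs the two phases to match could add
static_assert(phases_agree, "...") and fail early where they do not.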

We *could* specify that character literals in conditional preprocessing
directives are observed to have values corresponding to the execution
character set. Tools that only process through translation phase 4
could claim the source file character set as the execution character set
and be unaffected. The only impact would be on implementations that
process translation phases 1-7 *and* differentiate values in phases 4
and 7. I don't know if such implementations exist, but this situation
can arise with separate preprocessing and compilation steps. I suspect
it isn't worth trying to do anything here.

Tom.

Received on 2020-06-14 12:05:17