sg16: Re: [SG16] [wg14/wg21 liaison] Characters literals in preprocessor conditionals

From: Corentin <corentin.jabot_at_[hidden]>
Date: Sun, 14 Jun 2020 15:47:29 +0200

On Sun, Jun 14, 2020, 12:59 Florian Weimer <fw_at_[hidden]> wrote:

> * Tom Honermann via Liaison:
>
> > Copying the WG21/WG14 liaison list. For context, refer to C++
> > [cpp.cond]p12 <http://eel.is/c++draft/cpp.cond#12> and the value of
> > character literals in conditional preprocessing directives. Quote:
> >
> >> Whether the numeric value for these character-literals matches the
> >> value obtained when an identical character-literal occurs in an
> >> expression (other than within a #if or #elif directive) is
> >> implementation-defined.
> >> [ Note: Thus, the constant expression in the following #if directive
> >> and if statement ([stmt.if]) is not guaranteed to evaluate to the same
> >> value in these two contexts:
> >>
> >> #if 'z' - 'a' == 25
> >> if ('z' - 'a' == 25)
> >>
> >> — end note
> >> ]
>
> The canonical example is probably signed vs unsigned chars. The rule
> in the standard allows preprocessing to be independent of that.
>
> On the other hand, one cannot use a simple preprocessor conditional
> (such as the '\377' < 0) to tell whether chars are signed or not.
>
> > On 6/11/20 11:40 AM, Corentin via SG16 wrote:
> >> There are 3 use cases:
> >>
> >> * Detect ASCII vs EBCDIC at compile time, using different methods
> >> including #if 'A' == '\301', which I am not sure how it works (a
> >> more common attempt is #if 'A' == 65)
> >>
> > \301 == 0xC1 == 'A' in EBCDIC code pages.
> >>
> >> * Comparing with a #define, most frequent pattern being to
> >> compare with a path separator, as a means of compile time
> >> configuration.
> >> * Trying to detect other encodings
>
> I'm pretty sure there are more indirect uses of character constants in
> expressions, with constructs like this:
>
> # if defined TIOCGWINSZ && TIOCGSIZE == TIOCGWINSZ
>
> Where these ioctl constants are traditionally defined like this:
>
> #define TIOCGWINSZ _IOR('t', 104, struct winsize) /* get window size */
>
> And eventually this expands to something that performs arithmetic on
> 't'.
>

That still comes down to comparing one literal with another, which always
produces the expected result.
The problematic cases are comparing a literal with a number (independently
of signedness) and comparing the difference between two character literals,
the latter of which is trickier.
I am interested in making sure the value of an individual literal is the
same, independently of representation.
I think it is okay for arithmetic to behave differently.
Mostly, that already seems to be the expected behavior, but the wording
could be clarified (at least in C++).
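
To make the two contexts concrete, here is a minimal sketch in C that
evaluates the same expressions once in #if and once as ordinary
expressions. The PP_* macro names and the helper functions are made up for
illustration; nothing here is proposed wording, and whether the two
contexts agree is exactly the implementation-defined part.

```c
#include <limits.h>

/* The preprocessor's view of the arithmetic case from the note. */
#if 'z' - 'a' == 25
#  define PP_ALPHABET_CONTIGUOUS 1
#else
#  define PP_ALPHABET_CONTIGUOUS 0
#endif

/* The preprocessor's view of the signedness probe. */
#if '\377' < 0
#  define PP_CHAR_LITERAL_NEGATIVE 1
#else
#  define PP_CHAR_LITERAL_NEGATIVE 0
#endif

/* The same arithmetic evaluated as an ordinary expression. */
int expr_alphabet_contiguous(void)
{
    return 'z' - 'a' == 25;
}

/* Whether plain char is signed, taken from <limits.h> rather than
   from a character literal. */
int expr_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 when both contexts gave the same answers on this
   implementation, 0 otherwise. Either result is conforming. */
int phases_agree(void)
{
    return PP_ALPHABET_CONTIGUOUS == expr_alphabet_contiguous()
        && PP_CHAR_LITERAL_NEGATIVE == expr_char_is_signed();
}
```

Calling phases_agree() from a test program reports whether a given
compiler and set of flags keeps the two contexts consistent.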

The case I am trying to prevent is a literal being converted to
different encodings during different compilation phases.
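
As a rough illustration of that guarantee, the following C11 sketch
records what the preprocessor thinks 'A' is and then has the compiler
proper check the same comparison. The PP_A_IS_* macro names are
hypothetical, and the assertion is only expected to hold on
implementations that use one encoding for both phases; on one that does
not, the translation unit would fail to compile, which is the point of
the check.

```c
#include <assert.h>  /* static_assert (C11) */

/* Common ASCII probe: literal compared with a number. */
#if 'A' == 65
#  define PP_A_IS_65 1
#else
#  define PP_A_IS_65 0
#endif

/* EBCDIC probe: literal compared with another literal
   (\301 == 0xC1 == 'A' in EBCDIC code pages). */
#if 'A' == '\301'
#  define PP_A_IS_EBCDIC_A 1
#else
#  define PP_A_IS_EBCDIC_A 0
#endif

/* Compile-time check that the preprocessor and the compiler proper
   saw the same value for the individual literal 'A'. */
static_assert(PP_A_IS_65 == ('A' == 65),
              "'A' has different values in #if and in expressions");
```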

Received on 2020-06-14 08:50:52