Whether the numeric value for these character-literals matches the value obtained when an identical character-literal occurs in an expression (other than within a #if or #elif directive) is implementation-defined.
[ Note: Thus, the constant expression in the following #if directive and if statement ([stmt.if]) is not guaranteed to evaluate to the same value in these two contexts:
#if 'z' - 'a' == 25
if ('z' - 'a' == 25)
— end note
I did some research on the use of #if <boolean expression involving character literals> in the open source libraries in vcpkg (aggregating 90 million lines of C and C++ code).
\301 == 0xC1 == 'A' in EBCDIC code pages.
There are 3 use cases:
- Detect ASCII vs EBCDIC at compile time, using different methods including #if 'A' == '\301', though I am not sure how that one works (a more common attempt is #if 'A' == 65)
- Comparing with a #define, most frequent pattern being to compare with a path separator, as a means of compile time configuration.
- Trying to detect other encodings
In total this feature is used less than 80 times (lots of duplication).
My method is crude so my numbers are not precise (but that's the ballpark).
I found one use of #if (L'\0' - 1 < 0), which is fun, and one use of #if WCHAR_PATH_SEPARATOR != L'/'; there is no other use of prefixes.
There is a very strong expectation that these literals are in the execution encoding.
Agreed. My tests with various compilers available on godbolt confirm that implementations do convert these character literals as they would in phase 5.
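A check of this kind could look roughly like the sketch below. It is not the actual test that was run; the DEMO_PP_* macros and the demo_pp_matches_expressions function are hypothetical names. Each probe records what the #if directive concluded, and the function then compares that with the same comparison evaluated as an ordinary constant expression.

```cpp
// Illustrative only: hypothetical names, not the test that was actually run.

// Record what the preprocessor concludes for a few probes.
#if 'z' - 'a' == 25
#  define DEMO_PP_Z_MINUS_A_IS_25 1
#else
#  define DEMO_PP_Z_MINUS_A_IS_25 0
#endif

#if 'A' == 65
#  define DEMO_PP_A_IS_65 1
#else
#  define DEMO_PP_A_IS_65 0
#endif

#if 'A' == '\301'
#  define DEMO_PP_A_IS_OCTAL_301 1
#else
#  define DEMO_PP_A_IS_OCTAL_301 0
#endif

// True when every probe gives the same answer in #if as it does when the
// identical comparison is evaluated as an ordinary constant expression.
constexpr bool demo_pp_matches_expressions() {
    return ((DEMO_PP_Z_MINUS_A_IS_25 != 0) == ('z' - 'a' == 25))
        && ((DEMO_PP_A_IS_65 != 0) == ('A' == 65))
        && ((DEMO_PP_A_IS_OCTAL_301 != 0) == ('A' == '\301'));
}
```

A true result only shows agreement for these particular probes on the implementation being tested; it says nothing about other literals or other compilers.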
All but one of these usages seem to be in C libraries and C headers.
The wording states:
This includes interpreting character-literals, which may involve converting escape sequences into execution character set members. Whether the numeric value for these character-literals matches the value obtained when an identical character-literal occurs in an expression (other than within a #if or #elif directive) is implementation-defined.
I am not sure that deprecating this feature is necessary, even if the C++ use case is better covered by https://wg21.link/p1885r2.
But it does seem valuable to modify the wording to specify that these literals are interpreted as they would be in C expressions, since they are used by libraries intended to be portable. If the detection is wrong, the code nevertheless compiles and has the wrong runtime behavior.
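Until the wording guarantees this, a library could guard against a silent mismatch itself. The sketch below uses the path separator pattern as an example and assumes a C++11 compiler; DEMO_PATH_SEPARATOR, DEMO_WINDOWS_STYLE_PATHS and demo_windows_style_paths are hypothetical names. The static_assert turns a disagreement between the #if result and the expression result into a compile error instead of wrong runtime behavior.

```cpp
// Illustrative only: hypothetical configuration macro and helper names.

// Compile-time configuration knob, defaulting to a POSIX-style separator.
#ifndef DEMO_PATH_SEPARATOR
#  define DEMO_PATH_SEPARATOR '/'
#endif

// Decision made by the preprocessor.
#if DEMO_PATH_SEPARATOR == '\\'
#  define DEMO_WINDOWS_STYLE_PATHS 1
#else
#  define DEMO_WINDOWS_STYLE_PATHS 0
#endif

// Re-check the same comparison as an ordinary constant expression and fail
// the build if the two contexts disagree.
static_assert((DEMO_WINDOWS_STYLE_PATHS != 0) == (DEMO_PATH_SEPARATOR == '\\'),
              "#if and expression evaluation disagree on the path separator");

// Expose the preprocessor's decision to ordinary code.
constexpr bool demo_windows_style_paths() { return DEMO_WINDOWS_STYLE_PATHS != 0; }
```

This guard is only available to C++11 and later (or C11 with _Static_assert), so it would not directly help the C libraries where most of these usages were found.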
Agreed. It would be interesting to know if WG14 members have
some historical perspective on this being implementation-defined.
Perhaps Aaron can assist in getting an answer to that question.
Copying the liaison list...