On Wed, Jun 10, 2020 at 3:08 PM Corentin Jabot <corentinjabot@gmail.com> wrote:


On Wed, 10 Jun 2020 at 21:02, Hubert Tong via SG16 <sg16@lists.isocpp.org> wrote:
On Wed, Jun 10, 2020 at 2:51 PM Tom Honermann <tom@honermann.net> wrote:
On 6/10/20 1:28 PM, Hubert Tong wrote:

This may be useful for an intermediate stage of the process of updating the wording:

Basic source character set:
set of abstract characters used for the description of source code for the purposes of this document

I think the "basic source character set" fails to be a "character set". I believe the elements of a character set are mappings of values to abstract characters.

I agree with your characterization of a "character set".  But, the members of the "basic source character set" arguably have values on an implementation-defined basis.  Per [cpp.cond]p12, the values of character literals used in a conditional preprocessing directive may have values different from their values in the execution character set; and if they don't correspond to the values of the basic source character set members, then I don't know what else they might correspond to.  This may be indicative of the existence of an additional, currently unnamed, conditionally-defined character set.
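
To make the [cpp.cond]p12 point concrete, here is a minimal sketch. All names in it (pp_sees_a_as_97, exec_sees_a_as_97, contexts_agree) are hypothetical and chosen only for illustration. It evaluates the same character literal once in a #if directive and once in an ordinary constant expression; nothing in the wording quoted above requires the two results to match.

```cpp
// Hypothetical names throughout; this only illustrates the [cpp.cond]p12 point.

// Evaluated by the preprocessor. The numeric value given to 'a' here is
// implementation-defined and need not match the execution character set.
#if 'a' == 97
constexpr bool pp_sees_a_as_97 = true;
#else
constexpr bool pp_sees_a_as_97 = false;
#endif

// Evaluated as an ordinary constant expression, where 'a' has its
// execution character set value.
constexpr bool exec_sees_a_as_97 = ('a' == 97);

// True when both contexts agree on this one comparison.
constexpr bool contexts_agree() {
    return pp_sees_a_as_97 == exec_sees_a_as_97;
}
```

If an implementation did use different values in the two contexts, contexts_agree() would be one way to observe it for this particular literal.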

Sounds like the last option there, except we can simply go with implementation-defined. It could happen to be the same as the execution character set.

Agreed that making these comparisons implementation-defined might be the best option. Cleaning that up might be a breaking change, especially as EBCDIC encodings
may not have contiguous values for a-z, A-Z. It would definitely require some research.
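
As a rough illustration of the EBCDIC concern, here is a small sketch. The helper is_contiguous and the two tables are hypothetical names; the EBCDIC values are the lowercase letter positions shared by common code pages such as 037 and 1047, written out by hand here rather than taken from any library.

```cpp
#include <cstddef>

// Returns true when the n code values are strictly consecutive.
// Requires C++14 or later for the loop in a constexpr function.
constexpr bool is_contiguous(const unsigned char* codes, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        if (codes[i] != codes[i - 1] + 1) return false;
    }
    return true;
}

// a-z in ASCII: one unbroken run, 0x61 through 0x7A.
constexpr unsigned char ascii_lower[26] = {
    0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69,
    0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F, 0x70, 0x71, 0x72,
    0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79, 0x7A};

// a-z in EBCDIC: three runs (a-i, j-r, s-z) with gaps between them.
constexpr unsigned char ebcdic_lower[26] = {
    0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89,
    0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99,
    0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7, 0xA8, 0xA9};

static_assert(is_contiguous(ascii_lower, 26), "ASCII a-z is one run");
static_assert(!is_contiguous(ebcdic_lower, 26), "EBCDIC a-z has gaps");
```

With these tables, a directive such as #if 'z' - 'a' == 25 would hold if the preprocessor used ASCII values and fail if it used EBCDIC ones, where the difference is 40. That is the kind of existing code a cleanup could break.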

If we could fix that properly, converting to execution encoding before doing the computation would make sense.
I don't know what the effect would be, but we can consider deprecating the use of character values in preprocessing.