On Wed, Feb 3, 2021 at 10:03 PM Jens Maurer <Jens.Maurer@gmx.net> wrote:

On 03/02/2021 21.45, Corentin wrote:
>
>
> On Wed, Feb 3, 2021 at 9:22 PM Jens Maurer <Jens.Maurer@gmx.net> wrote:
>
> On 03/02/2021 19.22, Corentin wrote:
>
> > I thought we had discussed that the standard library has certain
> > facilities with locale-dependent character set.
> > I haven't found a mention of "execution character set" in the library
> > wording, so I'm interested in learning how these locale-dependent
> > character sets are described / referenced.
> >
> >
> > There is a whole new paragraph in the library introduction (page 10).
>
> That paragraph doesn't define the term "execution character set",
> for example.
>
>
> That paragraph is (supposed to be) the definition. These terms are not mentioned before; they are introduced in this paragraph, which (attempts to) describe them.

That paragraph fails to do that.

> And I have trouble parsing the sentences here. In particular, I
> don't understand what
> "with the same value in the execution character set"
> refers to ("the same" relative to what?)
>
>
> Same code point value.
> Say your literal encoding is ASCII and the code point value for 'A' is 65; then the execution encoding must be such that the code point value of 'A' is also 65.
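
To make the "same value" reading concrete, here is a minimal sketch. It assumes an implementation whose ordinary literal encoding is ASCII-compatible; the helper names are hypothetical, and 0xC1 is the value of 'A' in EBCDIC code page 037. The checks only inspect the literal encoding chosen at translation time; they say nothing about the encoding of the locale at run time, which is exactly the relation under discussion.

```cpp
// Hypothetical helpers: inspect the value the compiler gave the literal 'A'.

// True when the ordinary literal encoding maps 'A' to 65, as ASCII does.
constexpr bool literal_A_is_ascii() {
    return static_cast<unsigned char>('A') == 65;
}

// True when the ordinary literal encoding maps 'A' to 0xC1, as EBCDIC 037 does.
constexpr bool literal_A_is_ebcdic() {
    return static_cast<unsigned char>('A') == 0xC1;
}
```

Under the proposed constraint, an execution encoding paired with an ASCII literal encoding would also have to place 'A' at 65.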

And that means std::isalpha, for example, will return true?
Are any other functions affected by that constraint?
Where did we have that constraint previously?
Where is the C++20 normative statement for the edited
footnote in [multibyte.string]?

I think the footnote only says that NTBS are NTMBS.

And does that mean I can't compile a program with an EBCDIC
compiler (producing EBCDIC literal encoding) and then
run it in an ASCII environment? Or does that just
mean certain functions won't work on literals as
expected, e.g. std::isalpha('a') might not return true?

Certain functions will be UB. They already are: in your scenario, isalpha('a') violates the precondition that 'a' is a character in the encoding of the current locale.

std::string(runtime_string).find('a') will also return nonsense.

That constraint is currently not specified but, during execution, the program does not distinguish literals from runtime data, or ordinary literal encoding from execution encoding.

There are just strings assumed to be in the execution encoding, and if they aren't, they violate the preconditions of all of these functions.
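
As an illustration of the find() case, here is a minimal sketch. It assumes the program is compiled with an ASCII-compatible ordinary literal encoding while the runtime data arrives in EBCDIC code page 037, where "abc" is the byte sequence 0x81 0x82 0x83. The helper names are hypothetical, and the mismatch is simulated with explicit bytes rather than a real EBCDIC environment.

```cpp
#include <string>

// Hypothetical stand-in for runtime data: "abc" as encoded in EBCDIC
// code page 037 (assumed values a=0x81, b=0x82, c=0x83).
std::string ebcdic_abc() {
    return std::string("\x81\x82\x83", 3);
}

// Searches for the ordinary character literal 'a', whose value was fixed
// by the literal encoding at translation time.
bool contains_literal_a(const std::string& s) {
    return s.find('a') != std::string::npos;
}
```

With an ASCII literal encoding, contains_literal_a(ebcdic_abc()) is false even though the data spells "abc" in its own encoding, while contains_literal_a("abc") is true because both sides then use the literal encoding.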

> I struggled a bit with the formulation.
> I'm trying to say that both the execution character set and encoding are "supersets" of the literal ones, but "superset" of an encoding does not seem like a good formulation.

Where do we say that in the C++20 wording?

We don't. We should (unless we are happy with isalpha('a') returning false, puts("a") not displaying "a", and string("a").find('a') returning npos).

But I also don't see where the standard currently admits that the execution encoding as defined in [lex] can ever differ from the one used by the library.

I think that, for the standard, they are currently one and the same, and if we want to split the execution encoding from the literal encoding, there should be a description of how they relate to one another.

Jens