I (and SG16 in general) have been using the terms "execution character set" and "execution encoding" to refer to both the encoding known at compile-time that is used to encode character and string literals and the locale-dependent encoding, specified by the LC_CTYPE locale category, that is used at run-time by the character classification and conversion functions.  When necessary to avoid confusion, I've been referring to the former as the "presumed execution encoding" and the latter as simply the "run-time execution encoding".
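
To make the distinction concrete, here is a minimal C++ sketch.  The helper names bytes_hex and first_char_length are hypothetical, introduced only for illustration, and the locale names mentioned afterwards are platform-specific assumptions.  The first helper shows the bytes the compiler stored for a string; the second asks the run-time conversion machinery how many of those bytes form one character under the current LC_CTYPE locale.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <cwchar>
#include <string>

// Hex dump of the bytes stored for a narrow string.
// For a literal, these bytes are chosen at compile-time and do not
// change when the program later calls setlocale().
std::string bytes_hex(const char* s) {
    assert(s != nullptr);
    std::string out;
    for (; *s != '\0'; ++s) {
        char buf[3];
        std::snprintf(buf, sizeof buf, "%02X",
                      static_cast<unsigned>(static_cast<unsigned char>(*s)));
        out += buf;
    }
    return out;
}

// Number of bytes std::mbrtowc consumes for the first character of `s`,
// as interpreted under the *current* LC_CTYPE locale.  Returns -1 if the
// bytes are invalid or incomplete in that encoding.
int first_char_length(const char* s) {
    assert(s != nullptr);
    std::mbstate_t state{};
    wchar_t wc = L'\0';
    const std::size_t n = std::mbrtowc(&wc, s, std::strlen(s), &state);
    if (n == static_cast<std::size_t>(-1) || n == static_cast<std::size_t>(-2)) {
        return -1;
    }
    return static_cast<int>(n);
}
```

For the literal "\xC3\xA9" (the UTF-8 bytes for U+00E9, written as escapes so that the source encoding does not matter), bytes_hex always yields C3A9.  first_char_length, by contrast, reports 2 once LC_CTYPE has been set to a UTF-8 locale (for example "en_US.UTF-8", where available), while the answer under the "C" locale depends on the implementation.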

A discussion [1] with user 'alfps' on an r/cpp Reddit thread alerted me to the possibility that we have been using this term incorrectly.  I spent some time looking at both the C and C++ standards and there does appear to be evidence that "execution character set" (encoding) refers solely to the encoding known at compile-time that is used to encode literals.  But there doesn't seem to be a clear term defined for the locale-dependent run-time encoding that governs the behavior of the character classification and conversion functions.  There is some evidence for this encoding being referred to using the term "native".

From the C++ standard:

  1. [fs.path.type.cvt]p1: (though the definition provided here appears to be specific to path names).
    "The native encoding of an ordinary character string is the operating system dependent current encoding for path names.  The native encoding for wide character strings is the implementation-defined execution wide-character set encoding."
  2. [fs.path.type.cvt]p2.1: (This paragraph, the next one, and p8, which is not listed here, constitute the only uses of "native (ordinary|wide) encoding" in the C++ standard.)
    "char: The encoding is the native ordinary encoding. ..."
  3. [fs.path.type.cvt]p2.2:
    "wchar_t: The encoding is the native wide encoding. ..."
  4. [locale.codecvt]p3:
    "The specializations required in Table 101 ([locale.category]) convert the implementation-defined native character set. ... codecvt<wchar_t, char, mbstate_t> converts between the native character sets for ordinary and wide characters. ..."
  5. [locale.ctype]p2:
    "The specializations required in Table 101 ([locale.category]), namely ctype<char> and ctype<wchar_t>, implement character classing appropriate to the implementation's native character set."

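For reference, the facets named in items 4 and 5 are reached through a std::locale object at run-time.  The following is a minimal sketch of that; widen_with and is_alpha_in are hypothetical helper names, and the error handling (returning an empty string) is a simplification of my own choosing.

```cpp
#include <cwchar>
#include <locale>
#include <string>

// Widen `narrow` using the codecvt<wchar_t, char, mbstate_t> facet of `loc`.
// Returns an empty string if the facet reports anything other than success.
std::wstring widen_with(const std::locale& loc, const std::string& narrow) {
    using facet_t = std::codecvt<wchar_t, char, std::mbstate_t>;
    if (narrow.empty()) {
        return std::wstring();
    }
    const facet_t& cvt = std::use_facet<facet_t>(loc);
    std::mbstate_t state{};
    // A conversion produces at most one wchar_t per input byte.
    std::wstring wide(narrow.size(), L'\0');
    const char* from_next = nullptr;
    wchar_t* to_next = nullptr;
    const std::codecvt_base::result r =
        cvt.in(state,
               narrow.data(), narrow.data() + narrow.size(), from_next,
               &wide[0], &wide[0] + wide.size(), to_next);
    if (r != std::codecvt_base::ok) {
        return std::wstring();
    }
    wide.resize(static_cast<std::size_t>(to_next - &wide[0]));
    return wide;
}

// Classify `c` using the ctype<char> facet of `loc`.
bool is_alpha_in(const std::locale& loc, char c) {
    return std::use_facet<std::ctype<char>>(loc).is(std::ctype_base::alpha, c);
}
```

Passing std::locale::classic() exercises the "C" locale facets; passing std::locale("") would instead pick up the user's preferred locale, where the implementation supports it.  In both cases the behavior is selected at run-time, independently of how literals were encoded at compile-time.
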
As far as I can tell, none of the "native" terms quoted above appear in the C17 standard, but "native environment" appears in related wording:

C17 suggests that "extended character set" may also be the right term:

However, the C++ standard states (non-normatively) that the "extended character set" extends the basic source character set and (normatively) that it applies to both the source and execution character sets:

So, what term should we be using here?  Perhaps a core issue should be opened for this?  A brief search didn't reveal an existing one.

(note: you may need to click "continue this thread" when reading the Reddit thread to see all relevant comments).


[1]: https://www.reddit.com/r/cpp/comments/bfyp6x/overview_of_stdfilesystem_my_talk/