I think it's clear now that we don't have an answer to Tom's question in the title, and that the standard's language in this area is both vague and archaic.
I think that, before we make observable behavior changes, it would be worthwhile to respecify [lex] in more modern language, in particular distinguishing code points from encodings and avoiding 'character', which is misleading.
As observed in [filesystem], unsigned char and signed char do not have associated encodings. Moving that observation forward into the front matter might be useful, as would providing a (better) name for the presumed narrow/wide character execution encoding and a better name for the narrow/wide character encoding associated with the current default locale.
I'm willing to take a stab at it.
"Execution encoding" is a term we use in conversation; it's not actually a term in the standard. The standard speaks of execution character sets, the values of which are determined by locale. Which locale is not specified.
On 8/14/19 10:57 AM, Peter Dimov wrote:
> Tom Honermann wrote:
>> On 8/14/19 3:54 AM, Peter Dimov wrote:
>>> Tom Honermann wrote:
>>>> I think we *might* be successful in using "execution encoding" to
>>>> apply to both the compile-time and run-time encodings by extending the
>>>> term with specific qualifiers; e.g., "presumed execution encoding" and
>>>> "run-time/system/native execution encoding".
>>> This would be implying that there's a single "execution" or "native"
>>> encoding, whereas there are many.
>>> - encoding used for character literals
>> I made the "presumed execution encoding" distinction specifically for this
> Right, and I am saying that calling all the encodings "<adjective> execution
> encoding" implies that they are if not the same, then somehow related, and
> they aren't.
Ok, that is a fair critique.
> I would call the encoding used for narrow character literals "narrow literal
> encoding" and the encoding used for wide character literals "wide literal
> encoding". This is what they are.
I feel some reluctance to change a term that has been around for so
long, and this strikes me as too specific. There are other constructs
that are also encoded according to the (presumed) execution encoding.
For example, source locations exposed via the __FILE__ macro, function
names exposed via __func__, etc.
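
To make that concrete, here is a minimal sketch (not proposed wording).
The helper hex_bytes and the three accessor functions are hypothetical
names introduced only for illustration. The point is that a narrow
literal, __FILE__, and __func__ all end up as arrays of char whose bytes
were chosen by the implementation at compile-time:

```cpp
#include <string>

// Hypothetical helper: render the bytes of a narrow string as hex so the
// encoding chosen by the implementation at compile-time becomes visible.
std::string hex_bytes(const char* s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (; *s != '\0'; ++s) {
        unsigned char b = static_cast<unsigned char>(*s);
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// All three return pointers to char arrays whose contents were fixed by
// the compiler, before any run-time locale is known.
const char* literal_text() { return "A"; }       // ordinary narrow literal
const char* file_text()    { return __FILE__; }  // presumed source file name
const char* func_text()    { return __func__; }  // implementation-defined string
```

Comparing hex_bytes(literal_text()) across implementations (say, an
ASCII-based one and an EBCDIC-based one) would show different bytes for
the same source text, and the same holds for the other two.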
We don't know at compile-time how encoded literals will be used at
run-time. They may be passed to the locale-sensitive character
conversion functions, used as filenames, written to a terminal, etc.
None of these encodings is known until run-time. I kind of like the
use of "presumed execution encoding" as indicating a compatible subset
of all of the encodings used at run-time.
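
A small sketch of the run-time side, assuming only the standard
<clocale> interface; the function names current_ctype_locale and
adopt_environment_locale are hypothetical. A program starts in the "C"
locale, and the encoding used by the locale-sensitive conversion
functions only changes when the program asks for the environment's
locale:

```cpp
#include <clocale>
#include <string>

// Name of the locale currently governing character classification and
// multibyte conversion (LC_CTYPE). Passing nullptr queries the current
// setting without changing it.
std::string current_ctype_locale() {
    const char* name = std::setlocale(LC_CTYPE, nullptr);
    return name ? std::string(name) : std::string();
}

// Switch to the locale named by the environment. Returns false if the
// implementation rejects the request. Which encoding this selects is a
// property of the machine the program runs on, not of the compiler that
// built it.
bool adopt_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

Nothing in the translation unit that contains a literal can see the
result of adopt_environment_locale(), which is why the compile-time
choice can only ever be a presumption.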
> "Execution encoding" made sense when a program was, say, written in
> Krasnoyarsk and intended to be executed in Kuala Lumpur. A Krasnoyarsk
> machine used the Krasnoyarsk encoding for everything, and a Kuala Lumpur
> machine used the Kuala Lumpur encoding for everything. Hence source and
It still very much makes sense when cross-compiling today.
Core mailing list
Link to this post: http://lists.isocpp.org/core/2019/08/7062.php