sg16: Re: [SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?

From: Peter Dimov <pdimov_at_[hidden]>
Date: Wed, 14 Aug 2019 17:57:27 +0300

Tom Honermann wrote:
> On 8/14/19 3:54 AM, Peter Dimov wrote:
> > Tom Honermann wrote:
> >
> >> I think we *might* be successful in using "execution encoding" to
> >> apply to both the compile-time and run-time encodings by extending the
> >> term with specific qualifiers; e.g., "presumed execution encoding" and
> >> "run-time/system/native execution encoding".
> >
> > This would be implying that there's a single "execution" or "native"
> > encoding, whereas there are many.
> >
> > - encoding used for character literals
>
> I made the "presumed execution encoding" distinction specifically for this
> case.

Right, and I am saying that calling all the encodings "<adjective> execution
encoding" implies that they are, if not the same, then at least somehow
related, and they aren't.

I would call the encoding used for narrow character literals "narrow literal
encoding" and the encoding used for wide character literals "wide literal
encoding". This is what they are.
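
To make the distinction concrete, here is a minimal sketch (not taken from any
proposal; the helper names code_units and dump are made up for illustration)
that prints the code units a compiler actually chose for a narrow and a wide
literal. It assumes C++14 and an implementation that can represent U+00E9 in
both literal encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <class Ch, std::size_t N>
constexpr std::size_t code_units(const Ch (&)[N]) { return N - 1; }

// Hypothetical helper: space-separated hex dump of the code units of a
// literal. The values are whatever the compiler picked when it translated
// the literal; no locale is consulted at run time.
template <class Ch, std::size_t N>
std::string dump(const Ch (&s)[N]) {
    assert(s[N - 1] == Ch() && "expects a null-terminated literal");
    std::ostringstream os;
    os << std::hex;
    for (std::size_t i = 0; i + 1 < N; ++i) {
        if (i != 0) os << ' ';
        // Go through the unsigned counterpart so that a signed char
        // holding a value above 0x7f is not sign-extended.
        os << static_cast<unsigned long>(
                  static_cast<std::make_unsigned_t<Ch>>(s[i]));
    }
    return os.str();
}
```

Calling dump("\u00e9") and dump(L"\u00e9") from main shows the two encodings
side by side. With GCC or Clang defaults the narrow result is typically
"c3 a9" (UTF-8) and the wide result "e9", but both are determined at compile
time by compiler settings, and neither depends on the locale in effect when
the program runs.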

"Execution encoding" made sense when a program was, say, written in
Krasnoyarsk and intended to be executed in Kuala Lumpur. A Krasnoyarsk
machine used the Krasnoyarsk encoding for everything, and a Kuala Lumpur
machine used the Kuala Lumpur encoding for everything. Hence source and
execution.

...
> > (*) Here "none" (arbitrary NTBS not interpreted as characters by the FS)
> > is an option
>
> Except that, historically, some implementations apply locale
> transformations to *some* of the filesystem interfaces. For example,
> paths passed to std::fstream may be converted in a locale sensitive way
> while paths passed to std::wfstream may not be.

Right, some FSs accept arbitrary NTBSs and guarantee a perfect roundtrip,
and others do neither. But that's a library matter. The point is that there is
no single system execution encoding. Different parts of the system use
different encodings.

Received on 2019-08-14 16:57:38