SG16

Subject: Re: [SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?
From: Tom Honermann (tom_at_[hidden])
Date: 2019-08-15 07:34:30


On 8/15/19 7:12 AM, Steve Downey wrote:
> Execution encoding is a term we use in conversation, it's not actually
> a term in the standard. The standard speaks of execution character
> sets, the values of which are determined by locale. Which locale is
> not specified.

Indeed.  I just can't bring myself to use "character set" when the
context calls for "encoding".  This is something else I'd like to clean
up in the standard.

Tom.

>
> On Wed, Aug 14, 2019, 23:21 Tom Honermann via Core
> <core_at_[hidden]> wrote:
>
> On 8/14/19 10:57 AM, Peter Dimov wrote:
> > Tom Honermann wrote:
> >> On 8/14/19 3:54 AM, Peter Dimov wrote:
> >>> Tom Honermann wrote:
> >>>
> >>>>   I think we *might* be successful in using "execution encoding" to
> >>>> apply to both the compile-time and run-time encodings by extending the
> >>>> term with specific qualifiers; e.g., "presumed execution encoding" and
> >>>> "run-time/system/native execution encoding".
> >>> This would be implying that there's a single "execution" or "native"
> >>> encoding, whereas there are many.
> >>>
> >>> - encoding used for character literals
> >> I made the "presumed execution encoding" distinction specifically for this
> >> case.
> > Right, and I am saying that calling all the encodings "<adjective> execution
> > encoding" implies that they are if not the same, then somehow related, and
> > they aren't.
> Ok, that is a fair critique.
> >
> > I would call the encoding used for narrow character literals "narrow literal
> > encoding" and the encoding used for wide character literals "wide literal
> > encoding". This is what they are.
>
> I feel some reluctance to change a term that has been around for so
> long, and this strikes me as too specific.  There are other constructs
> that are also encoded according to the (presumed) execution encoding.
> For example, source locations exposed via the __FILE__ macro, function
> names exposed via __func__, etc.
>
> We don't know at compile-time how encoded literals will be used at
> run-time.  They may be passed to the locale-sensitive character
> conversion functions, used as filenames, written to a terminal, etc.
> None of these encodings are known until run-time.  I kind of like the
> use of "presumed execution encoding" as indicating a compatible subset
> of all of the encodings used at run-time.
>
> >
> > "Execution encoding" made sense when a program was, say, written in
> > Krasnoyarsk and intended to be executed in Kuala Lumpur. A Krasnoyarsk
> > machine used the Krasnoyarsk encoding for everything, and a Kuala Lumpur
> > machine used the Kuala Lumpur encoding for everything. Hence source and
> > execution.
>
> It still very much makes sense when cross-compiling today.
>
> Tom.
>
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2019/08/7062.php
>

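The distinction debated in this thread, between the encoding the compiler presumes for narrow literals and the locale-dependent encoding the conversion functions use at run-time, can be sketched roughly as follows. This is only an illustration and is not taken from the thread: the helper names hex_bytes and wide_length_in_locale are hypothetical, and any locale name other than "C" (for example "en_US.UTF-8") is an assumption about what the host system happens to provide.

```cpp
#include <clocale>
#include <cstddef>
#include <cstdio>
#include <cwchar>
#include <optional>
#include <string>

// Hypothetical helper: render the bytes of a narrow string as lowercase hex.
// These bytes were fixed when the program was compiled; no locale is consulted.
std::string hex_bytes(const char* s) {
    std::string out;
    for (auto p = reinterpret_cast<const unsigned char*>(s); *p != 0; ++p) {
        char buf[4];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(*p));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: count the wide characters that std::mbsrtowcs would
// produce for `s` when the named locale is in effect. Returns std::nullopt
// if the locale is unavailable or the bytes are invalid in its encoding.
std::optional<std::size_t> wide_length_in_locale(const char* s,
                                                 const char* locale_name) {
    // Remember the current locale so it can be restored afterwards.
    const char* current = std::setlocale(LC_ALL, nullptr);
    const std::string saved = current ? current : "C";

    if (std::setlocale(LC_ALL, locale_name) == nullptr) {
        return std::nullopt;  // locale not installed on this system
    }

    std::mbstate_t state{};
    const char* src = s;
    // With a null destination, mbsrtowcs only counts the resulting characters.
    const std::size_t n = std::mbsrtowcs(nullptr, &src, 0, &state);

    std::setlocale(LC_ALL, saved.c_str());

    if (n == static_cast<std::size_t>(-1)) {
        return std::nullopt;  // bytes are not valid in that locale's encoding
    }
    return n;
}
```

Under these assumptions, hex_bytes("\xC3\xA9") always reports the same two bytes, while wide_length_in_locale("\xC3\xA9", name) may report one character, two characters, or a failure depending on which locale is named and whether the system provides it. Passing __FILE__ or __func__ to hex_bytes shows the bytes the compiler chose for those constructs in the same way.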

