sg16: Re: [SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?

From: Steve Downey <sdowney_at_[hidden]>
Date: Tue, 13 Aug 2019 16:10:29 -0400

Getting back to the original question: I think "execution character set"
and "execution encoding" would refer to the encoding specified by the
default locale, the "C" locale. Calls to setlocale() do not change the
execution encoding; they change the global locale to a new locale.
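
A minimal sketch of that distinction, assuming nothing beyond the standard
library (the function names are made up for illustration): the bytes of a
narrow string literal are chosen when the translation unit is compiled, and
setlocale() only swaps the global locale around them.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Name of the current global C locale, queried without changing it.
// At program startup this is "C".
std::string global_locale_name() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Bytes of a narrow string literal. The compiler picked these bytes when it
// translated the source file; nothing at run time rewrites them.
std::string literal_bytes() {
    return "execution encoding";
}

// Switches the global locale to the environment's preferred one, then back
// to "C", and reports whether the literal's bytes stayed the same.
bool literal_survives_locale_change() {
    const std::string before = literal_bytes();
    std::setlocale(LC_ALL, "");  // may fail and return nullptr; the literal is untouched either way
    const std::string after = literal_bytes();
    const char* restored = std::setlocale(LC_ALL, "C");  // back to the startup locale
    assert(restored != nullptr);
    return before == after;
}
```

What does change after setlocale() is the behavior of the classification
and conversion functions, which is the run-time encoding the original
question was asking about.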

Any name is going to be confusing. I think it's better to just pair the
term with an explicit definition. Something like: the execution encoding is
the character set associated with the default "C" locale, and the program
is IFNDR (ill-formed, no diagnostic required) if the actual default
character set differs from the presumed translation from source encoding to
execution encoding, or if translation units with different execution
encodings are linked together. IFNDR because I don't see how it could
always be detected, but it can quickly turn into ODR violations where the
same named object has different definitions.

On Tue, Aug 13, 2019 at 1:22 PM Corentin <corentin.jabot_at_[hidden]> wrote:

>
>
> On Tue, Aug 13, 2019, 7:08 PM Thiago Macieira <thiago_at_[hidden]> wrote:
>
>> On Tuesday, 13 August 2019 09:55:07 PDT Corentin wrote:
>> > (if anyone is thinking about that, I don't recommend it. You're going
>> > to run into size limits: ICC at 512kB and MSVC at 256kB. Use something
>> > like xxd -i to generate a brace-delimited array instead)
>> >
>> > Afaik that works if you use \x to escape every byte otherwise some
>> > implementation will mess with your data. Nothing is guaranteed to be
>> > passthrough otherwise
>>
>> That would be ideal, but the problem I had was the unavailability of
>> proper tools to convert the input into a form that the C++ compiler
>> could consume. I was trying to do with a simple concatenation of a
>> header, data, and footer.
>>
>> The end result is a shell script, a Perl script and a powershell script:
>> https://codereview.qt-project.org/c/qt/qtbase/+/263548
>
>
> Interesting ! std::embed could be useful there (we are going a bit off
> script). Some kind of raw bytes literals or an implementation that would
> optimize parsing arrays of literals such that it is as efficient at compile
> time as strings would also be nice.
>
>>
>> --
>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>> Software Architect - Intel System Software Products
>>
>
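
As an aside on the embedding question in the quoted exchange, the two forms
being compared can be sketched like this. The byte values and the names
(blob_array, blob_string, check_blobs) are invented for illustration; only
the shapes come from the thread: a brace-delimited array of the kind
xxd -i generates, and a string literal with every byte written as a \x
escape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Form 1: brace-delimited array, the shape that `xxd -i` generates.
// Not subject to the string literal size limits mentioned in the thread.
const unsigned char blob_array[] = { 0x00, 0xff, 0x80, 0x41, 0x0a };
const std::size_t blob_array_len = sizeof blob_array;

// Form 2: string literal with every byte spelled as a \x escape, so no byte
// depends on how the compiler maps source characters to the execution
// encoding. The literal carries one extra terminating '\0'.
const char blob_string[] = "\x00\xff\x80\x41\x0a";
const std::size_t blob_string_len = sizeof blob_string - 1;

// Run-time check that both forms hold the same payload bytes.
void check_blobs() {
    assert(blob_array_len == blob_string_len);
    assert(std::memcmp(blob_array, blob_string, blob_array_len) == 0);
}
```

The array form avoids both the literal size limits and any question of
character translation, at the cost of slower parsing for large inputs,
which is the trade-off Corentin's wish for raw bytes literals is about.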

Received on 2019-08-13 22:10:43