Re: [SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 13 Aug 2019 12:07:48 -0400

> On Aug 13, 2019, at 10:20 AM, Niall Douglas via Core <core_at_[hidden]> wrote:
>
>> On 13/08/2019 09:38, Niall Douglas via Core wrote:
>> Before progressing with a solution, can I ask the question:
>>
>> Is it politically feasible for C++23 and C2x to require
>> implementations to default to interpreting source files as either (i)
>> 7-bit ASCII or (ii) UTF-8? To be specific, char literals would thus be
>> either 7-bit ASCII or UTF-8.
>
> I see that nobody has said no to this proposal yet. Yes, I agree with
> Corentin that escaped characters within literals are fine; you don't
> even need a UTF library in the compiler for those, so small C compiler
> folk won't complain.
>
> If nobody from WG21 objects to this proposal, shall I go ask WG14?

I object, but don’t have time to respond further right now. There are existing implementations where, by default, source files are assumed to be encoded with some EBCDIC code page. I don’t want to break those implementations, nor impose the significant burden such a change would place on users of those implementations.

This is (another) tangent to the original question. Source file encoding has nothing to do with execution encoding.

Tom.
>
> Because if they also don't object, then there is green grass ahoy.
>
> Niall
> _______________________________________________
> Core mailing list
> Core_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/core
> Link to this post: http://lists.isocpp.org/core/2019/08/7037.php

Received on 2019-08-13 18:07:52