SG16

Subject: Re: [SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?
From: Corentin Jabot (corentinjabot_at_[hidden])
Date: 2019-08-14 01:49:11


On Wed, Aug 14, 2019, 4:46 AM Tony V E <tvaneerd_at_[hidden]> wrote:

>
>
> On Tue, Aug 13, 2019 at 8:57 AM Corentin Jabot <corentinjabot_at_[hidden]>
> wrote:
>
>>
>>
>> On Tue, 13 Aug 2019 at 14:52, Ville Voutilainen <
>> ville.voutilainen_at_[hidden]> wrote:
>>
>>> On Tue, 13 Aug 2019 at 15:35, Corentin Jabot via Core
>>> <core_at_[hidden]> wrote:
>>> >
>>> >
>>> > Chiming in with my favorite solution:
>>> > Forbid u8/u16/u32 literals in non-Unicode encoded files
>>>
>>> But presumably not the ones that look like u8"\U1234" ?
>>>
>>
>> Yes, there is no reason to disallow that, as it can't be misinterpreted by
>> either the compiler or people (and quite a lot of code would needlessly
>> break).
>>
>>
> I find your lack of faith in people's ability to misinterpret something
> disturbing.
> :-)
>

😁 (Challenging your mail client)

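An escape such as \uXXXX or \UXXXXXXXX is unambiguous: it names a code point
directly, so it does not depend on the encoding of the source file.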

u8"é" is ambiguous. Both people and the compiler may interpret it in a
variety of ways. Notably, if the file is UTF-8 because I wrote it on Linux,
but the MSVC compiler then assumes it is Windows-1252, the result is
mojibake.
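
To make that failure concrete, here is a minimal sketch of what I believe
happens to the bytes. It is a simulation, not what any compiler literally
runs: the helper name reencode_bytes_as_latin1 is hypothetical, and it
assumes every non-ASCII byte falls in 0xA0-0xFF, where Windows-1252 and
ISO-8859-1 agree (the 0x80-0x9F range differs and is not modelled). The
input is spelled with \x escapes so the example itself does not depend on
the encoding of the file it is saved in.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: treat every input byte as one single-byte
// character (ISO-8859-1, which matches Windows-1252 for 0xA0-0xFF)
// and re-encode that character as UTF-8. This mimics a compiler that
// reads a UTF-8 source file with the wrong source encoding and then
// converts a u8 literal to UTF-8.
std::string reencode_bytes_as_latin1(std::string_view bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII is identical in both encodings.
            out.push_back(static_cast<char>(b));
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}

// Small self-check of the case discussed above.
inline void check_e_acute_example() {
    // "é" (U+00E9) as UTF-8 is the two bytes C3 A9.
    const std::string intended = "\xC3\xA9";
    // Misread as single-byte text, C3 is "Ã" and A9 is "©", so the
    // literal ends up as the four bytes C3 83 C2 A9 ("Ã©" in UTF-8).
    assert(reencode_bytes_as_latin1(intended) == "\xC3\x83\xC2\xA9");
}
```

The literal silently doubles in length and displays as "Ã©", with no
diagnostic. Writing u8"\u00E9" instead avoids the problem because the
compiler never has to decode a non-ASCII source byte.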

People also seem to be confused about this:

https://stackoverflow.com/questions/23471935/how-are-u8-literals-supposed-to-work

> --
> Be seeing you,
> Tony
>



SG16 list run by sg16-owner@lists.isocpp.org