Re: [SG16-Unicode] Draft: char8_t backward compatibility remediation paper

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 5 Dec 2018 22:19:09 -0500
On 12/5/18 8:31 PM, Markus Scherer wrote:
> On Wed, Dec 5, 2018 at 3:34 PM Steve Downey <sdowney_at_[hidden]
> <mailto:sdowney_at_[hidden]>> wrote:
>
> How many contain text that is not already UTF-8?
>
>
> I am not sure what you are asking. Most of the u8"literals" I am
> seeing contain non-ASCII characters. Many as literal characters, a
> bunch of \uhhhh, and a few \U00hhhhhh.

I was likewise uncertain about this question.

Steve, I'm guessing the question you're trying to get at is: would there
be any behavioral difference if the u8 prefix were simply dropped? I
think that is equivalent to asking whether the source files for these
examples are encoded as UTF-8 and whether the compiler is invoked such
that the source encoding and presumed execution encoding are both UTF-8.
That is always the case for Clang, is the default for gcc unless
-finput-charset or -fexec-charset is used, and is not the case for MSVC
unless /utf-8 is used.
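
To make the question concrete, here is a minimal sketch (not taken from the paper or from any of the surveyed code) that compares the bytes of an ordinary literal with those of the matching u8 literal. The function names are hypothetical. The character is written as \u00E9 so that the source file stays pure ASCII, which means the sketch only exercises the execution-encoding half of the question, not the source-encoding half.

```cpp
#include <cstring>

// A u8 literal is always UTF-8, so U+00E9 is stored as 0xC3 0xA9
// followed by the terminating null, whatever the compiler options.
bool u8_literal_is_utf8() {
    const auto& utf8 = u8"\u00E9";  // const char[3], or const char8_t[3] with char8_t
    const unsigned char expected[] = {0xC3, 0xA9, 0x00};
    return sizeof(utf8) == sizeof(expected) &&
           std::memcmp(utf8, expected, sizeof(expected)) == 0;
}

// An ordinary literal is stored in the execution encoding. Dropping the
// u8 prefix is harmless only if this returns true.
bool dropping_u8_changes_nothing() {
    const auto& plain = "\u00E9";   // bytes depend on the execution encoding
    const auto& utf8 = u8"\u00E9";  // bytes are always UTF-8
    return sizeof(plain) == sizeof(utf8) &&
           std::memcmp(plain, utf8, sizeof(utf8)) == 0;
}
```

I would expect dropping_u8_changes_nothing() to return true wherever the execution encoding is UTF-8, and false with, for example, gcc -fexec-charset=latin1 or MSVC without /utf-8 on a Windows-1252 system, where the ordinary literal should be the single byte 0xE9. I have not checked every compiler and option combination, so treat that as an expectation rather than a measurement.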

Tom.


Received on 2018-12-06 04:26:48