Yeah, that’s what I meant.
My concern was with “the scalar value of each source character shall be preserved” in the wording below:
“A UTF-8 file is a source file encoded with the UTF-8 encoding scheme defined in ISO/IEC 10646. An implementation shall support UTF-8 files. If the source file is determined to be a UTF-8 file, it shall represent a well-formed sequence of UTF-8 code units and the scalar value of each source character shall be preserved.”
My concern is that if you write a string literal with Unicode characters in it and the compiler converts them to GB18030, that’s not “preserving the scalar value”. I don’t understand translation phases very well, so feel free to tell me that’s somehow handled later on.
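A minimal sketch of the distinction in question, assuming the phase 1 requirement applies only to decoding the source file and that conversion to the literal encoding happens in a later phase. Python's codecs stand in for the compiler here; the variable names and the sample literal are hypothetical and do not come from P2295R3.

```python
# Illustration only: Python codecs stand in for a compiler's translation phases.
literal_text = "你好"  # contents of a string literal in the source file

# The source file as stored on disk, encoded as UTF-8.
source_bytes = f'const char* s = "{literal_text}";'.encode("utf-8")

# Phase 1 analogue: decode UTF-8 code units into Unicode scalar values.
# A well-formed UTF-8 file decodes without loss or substitution.
source_scalars = [ord(ch) for ch in source_bytes.decode("utf-8")]

# Later-phase analogue: encode the literal in the literal encoding (GB18030).
utf8_literal_bytes = literal_text.encode("utf-8")
gb_literal_bytes = literal_text.encode("gb18030")

# The byte sequences differ, but both denote the same scalar values.
literal_scalars = [ord(ch) for ch in literal_text]
round_trip_scalars = [ord(ch) for ch in gb_literal_bytes.decode("gb18030")]

print([f"U+{s:04X}" for s in literal_scalars])
print(utf8_literal_bytes.hex(" "), "vs", gb_literal_bytes.hex(" "))
print(round_trip_scalars == literal_scalars)
```

Under that assumption, re-encoding the literal changes its bytes but not the scalar values it denotes. GB18030 covers every Unicode scalar value, so the round trip is lossless; a literal encoding that cannot represent a given character would raise a separate question.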
From: Peter Brett <pbrett@cadence.com>
Sent: Thursday, April 29, 2021 3:34 AM
To: Charlie Barto <Charles.Barto@microsoft.com>
Cc: Corentin <corentin.jabot@gmail.com>; sg16@lists.isocpp.org
Subject: RE: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding
Hi Charlie,
I’m going to assume that:
In that case, no – as I understand it this wording does not affect the conformance of an implementation where the literal encoding is GB18030. Please could you clarify what it was about the phase 1 changes that caused concern?
Thanks!
Peter
From: SG16 <sg16-bounces@lists.isocpp.org> On Behalf Of Charlie Barto via SG16
Sent: 29 April 2021 09:54
To: sg16@lists.isocpp.org
Cc: Charlie Barto <Charles.Barto@microsoft.com>; Corentin <corentin.jabot@gmail.com>
Subject: Re: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding
Does that first change to lex.phases make the case where the source character set is UTF-8 and the execution character set is some oddball encoding (like GB18030) ill-formed/non-conforming?
From: SG16 <sg16-bounces@lists.isocpp.org> on behalf of Corentin via SG16 <sg16@lists.isocpp.org>
Sent: Thursday, April 29, 2021 12:34:35 AM
To: SG16 <sg16@lists.isocpp.org>
Cc: Corentin <corentin.jabot@gmail.com>
Subject: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding
Per request in yesterday's meeting, here is P2295R3, Support for UTF-8 as a portable source file encoding.
I am looking forward to your feedback.