Yeah, that’s what I meant.

 

My concern was with the phrase “the scalar value of each source character shall be preserved” in the text below:

 

“A UTF-8 file is a source file encoded with the UTF-8 encoding scheme defined in ISO/IEC 10646. An implementation shall support UTF-8 files. If the source file is determined to be a UTF-8 file, it shall represent a well-formed sequence of UTF-8 code units and the scalar value of each source character shall be preserved.”

 

My concern is that if you write a string literal with Unicode characters in it and the compiler converts them to GB18030, that’s not “preserving the scalar value”. I don’t understand translation phases very well, so feel free to tell me that’s somehow handled later on.

 

From: Peter Brett <pbrett@cadence.com>
Sent: Thursday, April 29, 2021 3:34 AM
To: Charlie Barto <Charles.Barto@microsoft.com>
Cc: Corentin <corentin.jabot@gmail.com>; sg16@lists.isocpp.org
Subject: RE: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding

 

Hi Charlie,

 

I’m going to assume that:

 

 

In that case, no – as I understand it this wording does not affect the conformance of an implementation where the literal encoding is GB18030.  Please could you clarify what it was about the phase 1 changes that caused concern?

 

Thanks!

 

              Peter

 

From: SG16 <sg16-bounces@lists.isocpp.org> On Behalf Of Charlie Barto via SG16
Sent: 29 April 2021 09:54
To: sg16@lists.isocpp.org
Cc: Charlie Barto <Charles.Barto@microsoft.com>; Corentin <corentin.jabot@gmail.com>
Subject: Re: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding

 

Does that first change to [lex.phases] make the case where the source character set is UTF-8 and the execution character set is some oddball encoding (like GB18030) ill-formed or non-conforming?

 


From: SG16 <sg16-bounces@lists.isocpp.org> on behalf of Corentin via SG16 <sg16@lists.isocpp.org>
Sent: Thursday, April 29, 2021 12:34:35 AM
To: SG16 <sg16@lists.isocpp.org>
Cc: Corentin <corentin.jabot@gmail.com>
Subject: [SG16] P2295R3 Support for UTF-8 as a portable source file encoding

 

Per request in yesterday's meeting, here is P2295R3, Support for UTF-8 as a portable source file encoding.

 

I am looking forward to your feedback.

 

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2295r3.pdf