On 7/27/21 6:34 PM, Hubert Tong wrote:
On Mon, Jul 26, 2021 at 11:44 AM Tom Honermann via SG16 <sg16@lists.isocpp.org> wrote:

SG16 approved forwarding a draft of P2295R5 (Support for UTF-8 as a portable source file encoding) and P2362R0 (Make obfuscating wide character literals ill-formed) with minor modifications to EWG during its July 14th telecon.  All requested SG16 changes are present in the published versions of P2295R5 and P2362R1 that appear in the most recent mailing (note that P2362R1 sports a new title).

These papers are now ready for review by EWG and the GitHub issue tracker has been updated accordingly.  Both papers have wording that has been reviewed by a core expert and each reflects existing implementation practice.

I will note that P2295's treatment of end-of-line indicators for UTF-8 source files has not yet been implemented (to my knowledge) on platforms where text files traditionally have "out-of-band" line length information. I am not aware of technical limitations that prevent having a convention that works in the manner P2295 indicates, so this comment is for information only.

Thank you for that correction, Hubert.

Is there a de facto standard convention for how text files that originate on other platforms are translated to such an environment?  For example, are new-line sequences in the original file removed in favor of such out-of-band information?  Or are they typically preserved?  If preserved, I imagine they may not correlate with the out-of-band line information.  Are there multiple new-line sequence forms in practice?

I'm asking because I would like to better understand the impact on programmers.  Given a UTF-8 encoded file on another platform, in practice, are there multiple ways in which such a file might be translated for this environment?  If so, is there a dominant representation?
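
To make the question concrete, here is a minimal Python sketch of two ways such a translation could conceivably work. Both conventions, the function names, and the record length of 8 are assumptions made up for illustration; neither is meant to describe what any actual platform or transfer tool does. A "record" stands in for a line whose length is stored out-of-band.

```python
from typing import List

# Hypothetical illustration only: neither function models a real conversion tool.

def to_records_stripping(data: bytes) -> List[bytes]:
    """Convention A (assumed): one record per source line, terminators dropped."""
    # Treat CRLF and LF alike, then split on LF.
    lines = data.replace(b"\r\n", b"\n").split(b"\n")
    # A trailing terminator leaves an empty final piece that is not a line.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines

def to_records_preserving(data: bytes, record_length: int) -> List[bytes]:
    """Convention B (assumed): bytes copied unchanged into fixed-size records."""
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]

# A small UTF-8 source file with LF line endings and one non-ASCII character.
source = "int x; // é\nint y;\n".encode("utf-8")

stripped = to_records_stripping(source)
preserved = to_records_preserving(source, record_length=8)
```

Under convention A the record boundaries are the line boundaries and no new-line bytes remain. Under convention B the new-line bytes survive inside the records and the record boundaries fall at arbitrary points, which is the mismatch with the out-of-band line information that I am asking about.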


P2295 has also been reviewed by SG22 (C/C++ Liaison) and has not been tagged for review by any other SGs.  P2362 still awaits SG22 review, so I encourage the EWG and SG22 chairs to coordinate to determine whether EWG review should await SG22's review.

Thank you to both authors for the time and patience they exhibited throughout the reviews of these papers, particularly with regard to finding wording for P2295.

