Re: [SG16] Reminder: SG16 telecon tomorrow, Wednesday, 2020-09-09

From: Corentin Jabot <corentinjabot_at_[hidden]>
Date: Tue, 8 Sep 2020 19:27:46 +0200
On Tue, 8 Sep 2020 at 17:18, Tom Honermann via SG16 <sg16_at_[hidden]>
wrote:

> This is your friendly reminder that an SG16 telecon will be held tomorrow,
> Wednesday September 9th, at 19:30 UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20200909T193000&p1=1440&p2=tz_pt&p3=tz_mt&p4=tz_ct&p5=tz_et&p6=tz_cest>
> ).
>
> This meeting will be conducted via Zoom. To attend, visit
> https://iso.zoom.us/j/8414530059 at the start of the meeting. Please
> contact me privately if necessary for the meeting password.
>
> The agenda is:
>
> - P2178R1: Misc lexing and string handling improvements
> <https://wg21.link/p2178r1>
> - Discuss proposal 1: Mandating support for UTF-8 encoded source
> files in phase 1
> - P2194R0: The character set of C++ source code is Unicode
> <https://isocpp.org/files/papers/P2194R0.pdf>
>
> For the UTF-8 discussion, please take some time ahead of the meeting to
> consider the following concerns:
>
> - Migration strategies for non-UTF-8 projects to transition to UTF-8,
> possibly incrementally.
>
We are not suggesting a change that would require migrating any existing
code.

>
> - Migration strategies for implementors to transition system headers
> to UTF-8, possibly incrementally.
>
Ditto.

>
> - Support for differently encoded source files within a single
> translation unit.
>
P2178R1 does not propose anything related to that.

>
> - Support for differently encoded primary source file within a single
> project.
>
P2178R1 would have no bearing on that (notably because the internal
representation is not observable by the program).

>
> - Error handling for ill-formed UTF-8 sequences in each of:
> - Comments
> - String literals
> - Elsewhere.
>
Proposal 1 of P2178R1 steers clear of that issue, but it is definitely worth
discussing.

>
> - Handling of BOMs.
>
>
> - Whether an in-source encoding annotation is needed and what form it
> should take:
> - A magic comment (like Python)
> - A pragma directive (like xlC)
>
This is an orthogonal concern (one which Tom is working on, and which is
worth discussing, but which falls outside the scope of P2178).

Current status (C++20): the encoding of the source file is picked in an
implementation-defined manner from an implementation-defined set of encodings.

Proposed by P2178: the encoding of the source file is picked in an
implementation-defined manner from an implementation-defined set of
encodings that includes UTF-8. No further restriction is added to the set
of implementation-defined encodings or to the implementation-defined mechanism
that determines the encoding. There just must be _some_ documented way for the
compiler to interpret files as UTF-8.
That's it.
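
As a rough illustration of what "some documented way" already looks like in
existing practice (this sketch is not part of P2178R1): GCC accepts
-finput-charset=UTF-8, MSVC accepts /source-charset:utf-8 (or /utf-8), and
Clang reads source files as UTF-8 by default. The helper name
source_was_read_as_utf8 below is hypothetical, introduced only for this
example, and the check assumes the file itself is saved as UTF-8.

```cpp
// Assumption: this file is saved as UTF-8, so the raw character below
// is stored on disk as the two bytes 0xC3 0xA9 (U+00E9).

// Hypothetical helper: reports whether the compiler decoded this source
// file as UTF-8. A u8 literal is always encoded as UTF-8, so a correctly
// decoded U+00E9 yields exactly two code units plus the terminator.
// If the file were decoded as, say, Latin-1, the literal would instead
// hold two separate characters and be longer.
inline bool source_was_read_as_utf8() {
    const auto& lit = u8"é";  // char[3] in C++17, char8_t[3] in C++20
    return sizeof(lit) == 3
        && static_cast<unsigned char>(lit[0]) == 0xC3
        && static_cast<unsigned char>(lit[1]) == 0xA9;
}
```

With any of the options above in effect, source_was_read_as_utf8() would be
expected to return true; nothing here depends on how other source encodings
are selected, which stays implementation-defined.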



>
>
> A very rough draft of a paper discussing these concerns is available at
> https://rawgit.com/tahonermann/sg16/master/papers/dyyyyr0-utf-8-source-files.html.
> We will **not** discuss this paper at this meeting, but the Existing
> Practice section
> <https://rawgit.com/tahonermann/sg16/master/papers/dyyyyr0-utf-8-source-files.html#existing_practice>
> may be informative (please ignore the rest of the draft for now).
>
> No decisions will be made at this meeting, but direction polls are
> expected.
> Tom.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>

Received on 2020-09-08 12:31:28