
Re: [SG16] Reminder: SG16 telecon tomorrow, Wednesday, 2020-09-09

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 9 Sep 2020 00:22:25 -0400
On 9/8/20 11:12 PM, Hubert Tong wrote:
> On Tue, Sep 8, 2020 at 1:57 PM Peter Brett via SG16
> <sg16_at_[hidden]> wrote:
> Hi Tom,
> Having reviewed the paper, I’m struggling to understand how most
> of those concerns are pertinent to discussing P2178R1 proposal 1.
> This is possibly about being clear in terms of proactively dealing
> with assumptions that non-committee members may have when they hear
> that "UTF-8" source is "completely okay" for C++. Tom's questions would
> point out all sorts of caveats. For example, the compiler might say
> that UTF-8 source is supported with a flag that causes all file
> processing for that invocation to need UTF-8 source. This is going to
> cause problems for header inclusion.
> There's a motivation in terms of user benefit for requiring support to
> consume UTF-8 encoded source. These questions are pertinent to
> ensuring that the benefits are actually realized.

Hubert's perception is exactly right. I'm raising these questions
because I believe more analysis is needed before we proceed in any
specific direction. Proposal 1 lacks sufficient analysis to inform
direction other than to say, "we want to support UTF-8". That is ok;
proposal 1 clearly wasn't intended as a final proposal as written. It
is clear to me that we have consensus for support of UTF-8, but there
are devils lurking in the details that we have yet to exorcise. The
discussion tomorrow will focus on those devils.

> I’m not averse to talking about them, because they are important
> and need to be addressed at some point, but it feels like giving
> them the attention that they deserve would not leave time for
> discussing P2194R0.
That is possible. I am ok with not getting to P2194 tomorrow, or only
getting a start on it, if the UTF-8 discussion is productive. If it
becomes unproductive, we'll switch.
> Please could we consider scheduling a discussion of these points
> for another meeting when your draft paper is ready to discuss in
> detail?
My goal is that the discussion will help to inform further development
of that draft paper and to attract collaborators. Just as P2194 is the
evolution of P2178 proposal 9, I hope that draft will become the
evolution of proposal 1. That implies that it must reflect consensus
opinions as well as dissenting ones and offer a choice of options to
present to EWG.
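
To make two of the listed concerns concrete (BOM handling and error handling for ill-formed UTF-8 sequences), here is a minimal sketch of what a phase 1 pre-check could look like. It is an illustration only, assuming C++17; the helper names `strip_utf8_bom` and `first_ill_formed_utf8` are hypothetical and do not come from P2178R1 or the draft paper. The byte ranges follow the well-formed UTF-8 table in the Unicode Standard. Whether an ill-formed sequence should be an error, a warning, or ignored in comments is exactly the open question; the sketch only locates it.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: drop a leading UTF-8 BOM (EF BB BF) if one is present.
constexpr std::string_view strip_utf8_bom(std::string_view s) {
    if (s.size() >= 3 &&
        static_cast<unsigned char>(s[0]) == 0xEF &&
        static_cast<unsigned char>(s[1]) == 0xBB &&
        static_cast<unsigned char>(s[2]) == 0xBF) {
        s.remove_prefix(3);
    }
    return s;
}

// Hypothetical helper: return the byte offset of the first ill-formed UTF-8
// sequence, or std::string_view::npos if the whole input is well-formed.
// Rejects overlong encodings, surrogates, values above U+10FFFF, stray
// continuation bytes, and truncated sequences.
inline std::size_t first_ill_formed_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 <= 0x7F) {  // ASCII, single byte
            ++i;
            continue;
        }
        std::size_t len = 0;
        unsigned char lo = 0x80, hi = 0xBF;  // allowed range for the 2nd byte
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0;  // excludes overlong forms
            if (b0 == 0xED) hi = 0x9F;  // excludes surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90;  // excludes overlong forms
            if (b0 == 0xF4) hi = 0x8F;  // excludes values above U+10FFFF
        } else {
            return i;  // 80..C1 and F5..FF can never start a sequence
        }
        if (i + len > s.size()) return i;  // truncated sequence
        const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
        if (b1 < lo || b1 > hi) return i;
        for (std::size_t k = 2; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if (b < 0x80 || b > 0xBF) return i;
        }
        i += len;
    }
    return std::string_view::npos;
}
```

A caller would strip the BOM first and then scan; an implementation could use the reported offset to decide, per context (comment, string literal, elsewhere), how to diagnose.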


> Many thanks,
> Peter
> From: SG16 <sg16-bounces_at_[hidden]> On Behalf Of Tom Honermann via SG16
> Sent: 08 September 2020 16:19
> To: SG16 <sg16_at_[hidden]>
> Cc: Tom Honermann <tom_at_[hidden]>
> Subject: [SG16] Reminder: SG16 telecon tomorrow, Wednesday, 2020-09-09
> This is your friendly reminder that an SG16 telecon will be held
> tomorrow, Wednesday September 9th, at 19:30 UTC (timezone
> conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20200909T193000&p1=1440&p2=tz_pt&p3=tz_mt&p4=tz_ct&p5=tz_et&p6=tz_cest>).
> This meeting will be conducted via Zoom. To attend, visit
> https://iso.zoom.us/j/8414530059
> at the start of the meeting. Please contact me privately if
> necessary for the meeting password.
> The agenda is:
> * P2178R1: Misc lexing and string handling improvements
> <https://wg21.link/p2178r1>
> o Discuss proposal 1: Mandating support for UTF-8 encoded
> source files in phase 1
> * P2194R0: The character set of C++ source code is Unicode
> <https://isocpp.org/files/papers/P2194R0.pdf>
> For the UTF-8 discussion, please take some time ahead of the
> meeting to consider the following concerns:
> * Migration strategies for non-UTF-8 projects to transition to
> UTF-8, possibly incrementally.
> * Migration strategies for implementors to transition system
> headers to UTF-8, possibly incrementally.
> * Support for differently encoded source files within a single
> translation unit.
> * Support for differently encoded primary source files within a
> single project.
> * Error handling for ill-formed UTF-8 sequences in each of:
> o Comments
> o String literals
> o Elsewhere.
> * Handling of BOMs.
> * Whether an in-source encoding annotation is needed and what
> form it should take:
> o A magic comment (like Python)
> o A pragma directive (like xlC)
> A very rough draft of a paper discussing these concerns is
> available at
> <https://rawgit.com/tahonermann/sg16/master/papers/dyyyyr0-utf-8-source-files.html>.
> We will **not** discuss this paper at this meeting, but the
> Existing Practice section
> <https://rawgit.com/tahonermann/sg16/master/papers/dyyyyr0-utf-8-source-files.html#existing_practice>
> may be informative (please ignore the rest of the draft for now).
> No decisions will be made at this meeting, but direction polls are
> expected.
> Tom.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

Received on 2020-09-08 23:25:57