Subject: Re: Agenda for the 2021-06-09 SG16 telecon
From: Tom Honermann (tom_at_[hidden])
Date: 2021-06-08 14:42:37


Reminder that this meeting is taking place tomorrow.

Updated wording is not yet available for P2295, so it is dropped from the
agenda.  We will therefore focus exclusively on establishing consensus
for P2093R6.  As part of that effort, we'll discuss the newly created
LWG issue 3565 <https://wg21.link/lwg3565> that Victor requested
following our last telecon.  If we run out of time, we'll discuss that
issue next time.

I have the following candidate polls prepared.  Thank you to Peter for
his assistance in drafting these.  Please respond with suggested
refinements so that we can avoid spending telecon time addressing
technicalities.

These polls are intended to establish consensus regarding our response
to the questions presented below.

Many of the polls are applicable to std::format() as well as
std::print().  This is intentional.  The results may be used to guide
further papers or resolution of LWG issues.

*General polls:*

*Poll 1:* P2093R6: <format> and <print> facilities should have
consistent behavior with respect to encoding expectations for the format
string.

N.B. Whether encoding expectations exist may depend on other factors,
such as what encoding is used for the literal encoding.

*Poll 2:* P2093R6: <format> and <print> facilities should have
consistent behavior with respect to encoding expectations for the output
of formatters.

*Poll 3:* P2093R6: Regardless of format string encoding assumptions,
<format> facilities (but not <print> facilities) may be used to format
binary data.

N.B. the implementation does not inspect the result of a std::format()
invocation, but this is not necessarily true for std::print().

*How should invalidly encoded text be handled when transcoding for the
purpose of writing directly to a device interface?*

Encoding issues may be introduced by any of the following:

  * A format string that is not encoded as expected by the formatting
    facility.
  * Text provided by a formatter that is differently encoded with
    respect to the format string or other formatters.  This covers both
    standard and user provided formatters and contributions from locale
    dependent text.

*Poll 4:* P2093R6: <print> facilities exhibit undefined behavior when a
format string or formatter output does not match encoding expectations.

N.B. the poll is phrased so as to be independent of whether output is
directed to a device or not.

*Poll 5:* P2093R6: <print> facilities exhibit undefined behavior when a
format string or formatter output does not match encoding expectations
and output is directed to a device that has encoding expectations.

*Poll 6:* P2093R6: <print> facility implementors are encouraged to
provide a run-time means for diagnosing format strings and formatter
output that do not match encoding expectations.

*Poll 7:* P2093R6: <print> facility implementors are encouraged to
substitute U+FFFD replacement characters following Unicode guidance when
output is directed to a device and transcoding is necessary.

N.B. transcoding is not necessarily required; e.g., when encoding
expectations are for UTF-8 and the device interface expects UTF-8.
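
To make the substitution in Poll 7 concrete, here is a minimal sketch of
transcoding UTF-8 to UTF-16 with U+FFFD replacement.  The function name
utf8_to_utf16_lossy is hypothetical and is not part of P2093R6.  The
sketch emits one U+FFFD per ill-formed sequence and resumes decoding at
the first byte that could not continue that sequence, which is my
reading of the Unicode guidance; it is not proposed wording.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not from P2093R6): transcode UTF-8 to UTF-16,
// emitting one U+FFFD for each ill-formed sequence.
std::u16string utf8_to_utf16_lossy(std::string_view in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        int needed = 0;                       // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;   // valid range for the next byte
        if (b0 <= 0x7F) {
            cp = b0;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            cp = b0 & 0x1F; needed = 1;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            cp = b0 & 0x0F; needed = 2;
            if (b0 == 0xE0) lo = 0xA0;        // reject overlong forms
            if (b0 == 0xED) hi = 0x9F;        // reject surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            cp = b0 & 0x07; needed = 3;
            if (b0 == 0xF0) lo = 0x90;        // reject overlong forms
            if (b0 == 0xF4) hi = 0x8F;        // reject values above U+10FFFF
        } else {
            out.push_back(u'\uFFFD');         // stray continuation or invalid lead
            ++i;
            continue;
        }
        ++i;
        bool ok = true;
        for (int k = 0; k < needed; ++k) {
            if (i >= in.size()) { ok = false; break; }   // truncated input
            const unsigned char b = static_cast<unsigned char>(in[i]);
            if (b < lo || b > hi) { ok = false; break; } // leave b for the next pass
            cp = (cp << 6) | (b & 0x3F);
            lo = 0x80; hi = 0xBF;
            ++i;
        }
        if (!ok) {
            out.push_back(u'\uFFFD');
            continue;
        }
        if (cp >= 0x10000) {                  // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

An implementation would presumably apply something like this only on the
path where output goes to a device that expects a different encoding;
when both sides are UTF-8, the bytes can pass through untouched.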

*Poll 8:* P2093R6: Neither <format> nor <print> facilities require an
explicit program-controlled error handling mechanism for violations of
encoding expectations.

N.B. such error handling mechanisms could be introduced in the future.

*Is use of UTF-8 as the literal encoding a sufficient indicator that all
input fed to std::format() and std::print() (including the format
string, programmer supplied field arguments, and locale provided text)
will be UTF-8 encoded?*

*Poll 9:* P2093R6: Use of UTF-8 as the literal encoding is sufficient
for <format> and <print> facilities to assume that the format string and
output of all formatters is UTF-8 encoded.
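
For reference, one way an implementation might detect a UTF-8 literal
encoding at compile time is to inspect the bytes of a non-ASCII ordinary
string literal.  The sketch below is an illustration with hypothetical
names, not wording from the paper.  Note that it only reports how the
compiler encoded ordinary literals; it says nothing about text that
arrives at run time, which is the question Poll 9 asks.

```cpp
#include <cstddef>

// Hypothetical helper: true if `s` is exactly the UTF-8 encoding of
// U+00B5 MICRO SIGN (0xC2 0xB5) followed by the terminating NUL.
template <std::size_t N>
constexpr bool is_utf8_micro_sign(const unsigned char (&s)[N]) {
    return N == 3 && s[0] == 0xC2 && s[1] == 0xB5;
}

// Probes the ordinary literal encoding by checking how the compiler
// encoded "\u00B5".  A Latin-1 literal encoding, for example, would
// produce the single byte 0xB5 instead.
constexpr bool literal_encoding_is_utf8() {
    const unsigned char micro[] = "\u00B5";
    return is_utf8_micro_sign(micro);
}
```

The result is fixed per translation unit, so it can select a code path
at compile time, but it cannot validate formatter output or locale
provided text.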

*Is the literal encoding a sufficient indicator in general that all
input fed to std::format() and std::print() (including the format
string, programmer supplied field arguments, and locale provided text)
will be provided in an encoding compatible with the literal encoding?*

*Poll 10:* P2093R6: Use of a literal encoding other than UTF-8 is
sufficient for <format> and <print> facilities to assume any particular
encoding for the format string and output of formatters.

*What are the implications for future support of std::print("{} {} {}
{}", L"Wide text", u8"UTF-8 text", u"UTF-16 text", U"UTF-32 text")?*

*Poll 11:* P2093R6: Support for implicit encoding conversions will only
be possible when an encoding assumption is implicitly or explicitly present.

N.B. a future paper could add the ability to pass an explicit encoding
tag, e.g., std::print(std::text_encoding::id::IBM1047, "{}", u8"hi").

*LWG 3565: Handling of encodings in localized formatting of chrono
types is underspecified*

*Poll 12:* LWG 3565: Adopt the proposed resolution as is.

Tom.

On 5/31/21 12:45 AM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a telecon on Wednesday, June 9th at 19:30 UTC (timezone
> conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20210609T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>).
>
> The agenda is:
>
>   * D2295R4: Support for UTF-8 as a portable source file encoding
>     <https://isocpp.org/files/papers/D2295R4.pdf>
>       o Review updated wording produced through collaboration between
>         Corentin, Jens, and Hubert to resolve earlier feedback at
>         https://lists.isocpp.org/sg16/2021/04/2353.php.
>   * P2093R6: Formatted output <https://wg21.link/p2093r6>
>       o Continue discussion and poll for consensus on answers to the
>         following questions:
>          1. How should invalidly encoded text be handled when
>             transcoding for the purpose of writing directly to a
>             device interface?
>          2. Is use of UTF-8 as the literal encoding a sufficient
>             indicator that all input fed to std::format() and
>             std::print() (including the format string, programmer
>             supplied field arguments, and locale provided text) will
>             be UTF-8 encoded?
>          3. Is the literal encoding a sufficient indicator in general
>             that all input fed to std::format() and std::print()
>             (including the format string, programmer supplied field
>             arguments, and locale provided text) will be provided in
>             an encoding compatible with the literal encoding?
>          4. What are the implications for future support of
>             std::print("{} {} {} {}", L"Wide text", u8"UTF-8 text",
>             u"UTF-16 text", U"UTF-32 text")?
>
> Discussion of D2295R4 is contingent on updated wording being available.
>
> For P2093R6, I believe we have sufficiently discussed transcoding
> concerns (including concerns related to locale provided field
> arguments) to be able to answer the first question above with strong
> consensus.  I likewise suspect that further discussion on the third
> question is unnecessary and that we are reasonably well positioned to
> poll it.  We began discussion around the second question at the last
> telecon, but I feel that some more discussion is needed.  We haven't
> discussed question four at all, but I expect to arrive at a clearly
> objective answer for that one.
>
> I would like for us to complete discussion and polling for P2093
> during this telecon.  I don't know if that is realistic, but that is
> what we'll aim for.  I will reply to this email with a set of
> candidate polls in advance of the telecon with the hope that we'll be
> able to reduce time negotiating polls during the telecon itself.
>
> Tom.
>
>
>


