Date: Sat, 26 Jun 2021 07:27:34 -0700
AFAIK the transcoding is unavoidable in this case because even if we don't
do it in the library, Windows itself will do it internally. Also, there is
only so much text you can meaningfully print to a terminal without
overwhelming a user, so the cost is unlikely to matter in practice.
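
To make the transcoding step concrete, here is a minimal, portable sketch of a
UTF-8 to UTF-16 conversion that substitutes U+FFFD for invalid input. The
function name utf8_to_utf16_lossy is hypothetical; it is not taken from P2093
or {fmt}. It emits one replacement character per offending byte, which is
simpler than the Unicode recommended practice for replacing ill-formed
sequences.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only: converts UTF-8 to UTF-16 and
// emits U+FFFD for every byte that is not part of a well-formed sequence.
std::u16string utf8_to_utf16_lossy(std::string_view in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const auto b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;  // stays 0 if b0 cannot start a sequence
        if (b0 < 0x80)                     { cp = b0;        len = 1; }
        else if (b0 >= 0xC2 && b0 <= 0xDF) { cp = b0 & 0x1F; len = 2; }
        else if (b0 >= 0xE0 && b0 <= 0xEF) { cp = b0 & 0x0F; len = 3; }
        else if (b0 >= 0xF0 && b0 <= 0xF4) { cp = b0 & 0x07; len = 4; }

        // Every trailing byte must exist and have the form 10xxxxxx.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const auto b = static_cast<unsigned char>(in[i + k]);
            ok = (b & 0xC0) == 0x80;
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates, and out-of-range values.
        if (ok && len == 3 && cp < 0x800) ok = false;
        if (ok && len == 4 && cp < 0x10000) ok = false;
        if (ok && cp >= 0xD800 && cp <= 0xDFFF) ok = false;
        if (ok && cp > 0x10FFFF) ok = false;

        if (!ok) {
            out.push_back(char16_t{0xFFFD});  // replacement character
            ++i;                              // resynchronize on the next byte
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Encode a supplementary-plane code point as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

Validation here is a handful of comparisons per code point, on top of a
conversion that has to happen anyway.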
With P2216 the format string is a compile-time string which is either a
literal or something derived from literals and such.
- Victor
On Wed, Jun 23, 2021 at 4:33 PM Steve Downey <sdowney_at_[hidden]> wrote:
> IIUC the major case of concern for transcoding is the case where print to
> win console has a mandatory transcode from something to UTF-16.
> Handling misencoded UTF-16 is much more expensive than in UTF-8, where it's
> almost free. I'm not sure concerns about cost are warranted.
>
> Also, for the print and literal encoding issue, can we actually tell if
> it's a literal? Or do we really mean the associated encoding of the char
> type? Is this just for the compile time machinery?
>
>
> On Wed, Jun 23, 2021, 15:14 Victor Zverovich via SG16 <
> sg16_at_[hidden]> wrote:
>
>> Also I think that poll 2.1 is ill-formed. std::format doesn't and cannot
>> have any encoding expectations; it's individual formatters that may or may
>> not have such expectations. For example, string formatters currently have
>> such expectations, which manifest in Unicode-based width estimation. At the
>> same time {fmt} has a bytes formatter for binary data which doesn't do such
>> estimation. Such a formatter can be easily written for C++20 std::format as
>> well.
>>
>> - Victor
>>
>> On Wed, Jun 23, 2021 at 11:57 AM Victor Zverovich <
>> victor.zverovich_at_[hidden]> wrote:
>>
>>> The apparent conflict comes from conflating file and terminal output.
>>> While it makes a lot of sense to use print for writing binary data to files
>>> it makes much less sense to do the same for a terminal. In any case if
>>> there is an escaping mechanism, that should resolve it.
>>>
>>> - Victor
>>>
>>> On Tue, Jun 22, 2021 at 11:29 AM Tom Honermann via SG16 <
>>> sg16_at_[hidden]> wrote:
>>>
>>>> Reminder that this meeting is taking place tomorrow.
>>>>
>>>> Once we complete the remaining design polls, I'd like to clarify what
>>>> might be perceived as a conflict in our poll results for poll 3.2 vs polls
>>>> 4 and 5; we can't state both that binary data may be formatted and that
>>>> formatter output that doesn't match an expected encoding is UB. Hubert's
>>>> suggestion of an escape mechanism would suffice to resolve the apparent
>>>> conflict. Are there other ways to interpret these polls? Or other
>>>> solutions to resolve the apparent conflict?
>>>>
>>>> Tom.
>>>>
>>>> On 6/17/21 11:56 AM, Tom Honermann via SG16 wrote:
>>>>
>>>> SG16 will hold a telecon on Wednesday, June 23rd at 19:30 UTC (timezone
>>>> conversion
>>>> <https://www.timeanddate.com/worldclock/converter.html?iso=20210623T193000&p1=1440&p2=tz_pdt&p3=tz_mdt&p4=tz_cdt&p5=tz_edt&p6=tz_cest>
>>>> ).
>>>>
>>>> The agenda is:
>>>>
>>>> - P2093R6: Formatted output <https://wg21.link/p2093r6>
>>>> - Finish polling begun at the last telecon.
>>>> - LWG 3565: Handling of encodings in localized formatting of chrono
>>>> types is underspecified <https://cplusplus.github.io/LWG/issue3565>
>>>> - Discuss and poll the proposed resolution.
>>>> - P2295R4: Support for UTF-8 as a portable source file encoding
>>>> <https://wg21.link/p2295r4>
>>>> - Review updated wording produced through collaboration between
>>>> Corentin, Jens, Hubert, and Peter.
>>>> - https://lists.isocpp.org/sg16/2021/04/2353.php
>>>> - https://lists.isocpp.org/sg16/2021/06/2429.php
>>>>
>>>> At the last telecon, we discussed addressing LWG 3565 as the first
>>>> agenda item for this telecon. However, I would prefer to finish polling
>>>> for P2093R6 first as I expect some of the remaining candidate polls to be
>>>> potentially relevant for the LWG issue resolution.
>>>>
>>>> For reference, here are the P2093R6 polls and poll results taken during
>>>> the last telecon (I'll get the meeting summary posted soon). Consensus so
>>>> far appears to be rather strong with the exception of poll 3.2.
>>>>
>>>> - *Poll 1: P2093R6: <format> and <print> facilities should have
>>>> consistent behavior with respect to encoding expectations for the format
>>>> string.*
>>>> Attendees: 8
>>>> No objection to unanimous consent.
>>>> - *Poll 2.1: P2093R6: <format> and <print> facilities should have
>>>> consistent behavior with respect to encoding expectations for the output of
>>>> formatters.*
>>>> <Not polled; per discussion, revisit following later polls>
>>>> - *Poll 2.2: P2093R6: formatters should not be sensitive to whether
>>>> they are being used with a <format> or <print> facility.*
>>>> Attendees: 8
>>>> No objection to unanimous consent.
>>>> - *Poll 3.1: P2093R6: Regardless of format string encoding
>>>> assumptions, <format> facilities may be used to format binary data.*
>>>> Attendees: 8 (1 abstention)
>>>> SF F N A SA
>>>> 5 1 1 0 0
>>>> Strong consensus
>>>> - *Poll 3.2: P2093R6: Regardless of format string encoding
>>>> assumptions, <print> facilities may be used to format binary data.*
>>>> Attendees: 8 (1 abstention)
>>>> SF F N A SA
>>>> 2 1 3 1 0
>>>> Weak consensus
>>>> - *Poll 4: P2093R6: <print> facilities exhibit undefined behavior
>>>> when an encoding expectation is present and a format string or formatter
>>>> output does not match those expectations.*
>>>> Attendees: 8 (1 abstention)
>>>> SF F N A SA
>>>> 2 4 0 0 1
>>>> Strong consensus
>>>> - *Poll 5: P2093R6: <print> facilities exhibit undefined behavior
>>>> when an encoding expectation is present and a format string or formatter
>>>> output does not match those expectations and output is directed to a device
>>>> that has encoding expectations.*
>>>> Attendees: 8 (1 abstention)
>>>> SF F N A SA
>>>> 6 0 1 0 0
>>>> Stronger consensus than poll 4.
>>>> - *Poll 6: P2093R6: <print> facility implementors are encouraged to
>>>> provide a run-time means for diagnosing format strings and formatter output
>>>> that is not well-formed according to the expected encoding.*
>>>> Attendees: 8 (1 abstention)
>>>> SF F N A SA
>>>> 4 0 2 1 0
>>>> Consensus.
>>>>
>>>> The remaining candidate polls are:
>>>>
>>>> - Poll 2.1: P2093R6: <format> and <print> facilities should have
>>>> consistent behavior with respect to encoding expectations for the output of
>>>> formatters.
>>>> - Poll 7: P2093R6: <print> facility implementors are encouraged to
>>>> substitute U+FFFD replacement characters following Unicode guidance when
>>>> output is directed to a device and transcoding is necessary.
>>>> - Poll 8: P2093R6: Neither <format> nor <print> facilities require
>>>> an explicit program-controlled error handling mechanism for violations of
>>>> encoding expectations.
>>>> - Poll 9: P2093R6: Use of UTF-8 as the literal encoding is
>>>> sufficient for <format> and <print> facilities to assume that the format
>>>> string and output of all formatters is UTF-8 encoded.
>>>> - Poll 10: P2093R6: Use of a literal encoding other than UTF-8 is
>>>> sufficient for <format> and <print> facilities to assume a particular
>>>> encoding for the format string and output of formatters.
>>>> - Poll 11: P2093R6: Support for implicit encoding conversions will
>>>> only be possible when an encoding assumption is implicitly or explicitly
>>>> present.
>>>>
>>>> Assuming good consensus on those polls, we'll poll forwarding P2093R6
>>>> to LEWG again with direction to revise the paper to align with SG16
>>>> feedback. At a minimum, a revision will be expected to record SG16
>>>> direction and rationale. In order to avoid spending more SG16 telecon time
>>>> on this paper, we'll look for a volunteer to review the updated revision
>>>> and report back to SG16.
>>>>
>>>> - Poll X: P2093R6: Direct Victor to revise the paper to reflect
>>>> SG16 rationale and guidance, delegate review of a future revision to XXX,
>>>> and forward to LEWG for inclusion in C++23 pending review confirmation.
>>>>
>>>> Tom.
>>>>
>>>>
>>>> --
>>>> SG16 mailing list
>>>> SG16_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>>>
>>> --
>> SG16 mailing list
>> SG16_at_[hidden]
>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>
>
Received on 2021-06-26 09:27:48