Poll 1: P2093R6: <format> and <print> facilities should have consistent behavior with respect to encoding expectations for the format string.
N.B. Whether encoding expectations exist may depend on other factors, such as which encoding is used as the literal encoding.
Poll 2: P2093R6: <format> and <print> facilities should have consistent behavior with respect to encoding expectations for the output of formatters.
Poll 3: P2093R6: Regardless of format string encoding assumptions, <format> facilities (but not <print> facilities) may be used to format binary data.
N.B. the implementation does not inspect the result of a std::format() invocation, but this is not necessarily true for std::print().
N.B. the poll is phrased so as to be independent of whether output is directed to a device or not.
Poll 5: P2093R6: <print> facilities exhibit undefined behavior when a format string or formatter output does not match encoding expectations and output is directed to a device that has encoding expectations.
Poll 6: P2093R6: <print> facility implementors are encouraged to provide a run-time means for diagnosing format strings and formatter output that do not match encoding expectations.
Poll 7: P2093R6: <print> facility implementors are encouraged to substitute U+FFFD replacement characters following Unicode guidance when output is directed to a device and transcoding is necessary.
N.B. transcoding is not necessarily required; e.g., when encoding expectations are for UTF-8 and the device interface expects UTF-8.
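As an illustration of the substitution Poll 7 describes, the following is a sketch under stated assumptions, not a proposed implementation. It assumes the device interface expects UTF-16 and that "Unicode guidance" is read as the "substitution of maximal subparts" practice. The function name utf8_to_utf16_lossy is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: transcodes UTF-8 to UTF-16, emitting one U+FFFD
// for each maximal ill-formed subsequence in the input.
std::u16string utf8_to_utf16_lossy(std::string_view in) {
    constexpr char16_t replacement = u'\xFFFD';
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const auto b0 = static_cast<unsigned char>(in[i]);
        ++i;
        if (b0 <= 0x7F) {  // ASCII passes through unchanged
            out.push_back(static_cast<char16_t>(b0));
            continue;
        }
        std::uint32_t cp = 0;
        int needed = 0;                   // continuation bytes still expected
        unsigned char lo = 0x80, hi = 0xBF;  // allowed range for the next byte
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            needed = 1; cp = b0 & 0x1F;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            needed = 2; cp = b0 & 0x0F;
            if (b0 == 0xE0) lo = 0xA0;    // reject overlong forms
            if (b0 == 0xED) hi = 0x9F;    // reject encoded surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            needed = 3; cp = b0 & 0x07;
            if (b0 == 0xF0) lo = 0x90;    // reject overlong forms
            if (b0 == 0xF4) hi = 0x8F;    // reject values above U+10FFFF
        } else {                          // lone continuation or invalid lead byte
            out.push_back(replacement);
            continue;
        }
        bool ok = true;
        for (int k = 0; k < needed; ++k) {
            if (i >= in.size()) { ok = false; break; }  // truncated sequence
            const auto b = static_cast<unsigned char>(in[i]);
            if (b < lo || b > hi) { ok = false; break; }  // leave b for the next pass
            cp = (cp << 6) | (b & 0x3Fu);
            lo = 0x80; hi = 0xBF;
            ++i;
        }
        if (!ok) {
            out.push_back(replacement);
        } else if (cp >= 0x10000) {       // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

A byte that ends an ill-formed sequence is not consumed with it, so it is re-examined as the possible start of the next sequence. Well-formed input is transcoded without any substitution.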
Poll 8: P2093R6: Neither <format> nor <print> facilities require an explicit program-controlled error handling mechanism for violations of encoding expectations.
N.B. such error handling mechanisms could be introduced in the future.
SG16 will hold a telecon on Wednesday, June 9th at 19:30 UTC.
The agenda is:
- D2295R4: Support for UTF-8 as a portable source file encoding
  - Review updated wording produced through collaboration between Corentin, Jens, and Hubert to resolve earlier feedback at https://lists.isocpp.org/sg16/2021/04/2353.php.
- P2093R6: Formatted output
  - Continue discussion and poll for consensus on answers to the following questions:
    - How should invalidly encoded text be handled when transcoding for the purpose of writing directly to a device interface?
    - Is use of UTF-8 as the literal encoding a sufficient indicator that all input fed to std::format() and std::print() (including the format string, programmer supplied field arguments, and locale provided text) will be UTF-8 encoded?
    - Is the literal encoding a sufficient indicator in general that all input fed to std::format() and std::print() (including the format string, programmer supplied field arguments, and locale provided text) will be provided in an encoding compatible with the literal encoding?
    - What are the implications for future support of std::print("{} {} {} {}", L"Wide text", u8"UTF-8 text", u"UTF-16 text", U"UTF-32 text")?
Discussion of D2295R4 is contingent on updated wording being available.
For P2093R6, I believe we have sufficiently discussed transcoding concerns (including concerns related to locale provided field arguments) to be able to answer the first question above with strong consensus. I likewise suspect that further discussion on the third question is unnecessary and that we are reasonably well positioned to poll it. We began discussion around the second question at the last telecon, but I feel that some more discussion is needed. We haven't discussed question four at all, but I expect to arrive at a clearly objective answer for that one.
I would like for us to complete discussion and polling for P2093 during this telecon. I don't know if that is realistic, but that is what we'll aim for. I will reply to this email with a set of candidate polls in advance of the telecon, with the hope that we'll be able to reduce the time spent negotiating polls during the telecon itself.
Tom.