Date: Mon, 11 Sep 2023 17:44:50 -0400
SG16 will hold a telecon on Wednesday, September 13th, at 19:30 UTC
(timezone conversion
<https://www.timeanddate.com/worldclock/converter.html?iso=20230913T193000&p1=1440&p2=tz_pt&p3=tz_mt&p4=tz_ct&p5=tz_et&p6=tz_cest>).
The agenda follows.
* P2845R2: Formatting of std::filesystem::path
<https://wg21.link/p2845r2>:
o Continue review.
* P2728R6: Unicode in the Library, Part 1: UTF Transcoding
<https://wg21.link/p2728r6>:
o Continue review.
We last discussed P2845R0 <https://wg21.link/p2845r0> during the
2023-07-12 SG16 meeting
<https://github.com/sg16-unicode/sg16-meetings/tree/master#july-12th-2023>.
The new revision incorporates feedback previously provided. We'll
continue review and, if all looks good, poll to forward.
We continued our review of P2728R6 <https://wg21.link/p2728r6> at our
last meeting on 2023-08-23
<https://github.com/sg16-unicode/sg16-meetings/tree/master#august-23rd-2023>
and it remains the current revision. There are still some design aspects
in the paper that I am not confident that SG16 has consensus for. These
include:
1. Our last discussion highlighted the limitations imposed on error
handling by the iterator interface. Assuming that we agree on a
simplified error handling approach, I would like to review the
transcoding_error_handler concept, the utility of the msg parameter
that is passed to an error handler, and the inability to specify an
error handler to be used with the utfN_view view adapters. The paper
provides an explanation in section 5.4.2 for why custom error
handling is not proposed for the utfN_view adapters, but does not
provide supporting rationale. Does SG16 agree with this approach?
2. As discussed in section 5.2.3, previous SG16 polls have established
strong support for relying on code unit type to infer encoding. The
introduction of the as_charN_t views provides a means to interpret a
sequence of code unit values stored in another type as a sequence of
charN_t. This suggests the possibility of removing the proposed
std::uc::format enumeration in favor of reliance on charN_t code
unit types everywhere. I don't want to engage in on-the-fly design
discussion; the question is whether SG16 would like to pursue such
simplification. Are there other opportunities for simplification
that SG16 would like to pursue?
Finally, I'd like to talk about the bigger picture and how these UTF
transcoders fit into it. My expectation is that SG16 will eventually
review proposals that cover encoding and decoding of UTF (and other)
encodings with extensive error handling capabilities as well as features
that support transcoding of text in char and wchar_t based storage
across arbitrary encodings. Do we expect the proposed
use_replacement_character error handler or other types that model
transcoding_error_handler to be usable with these other facilities? Are
the names what we would suggest for their intended scope?
Tom.
Received on 2023-09-11 21:44:52