Date: Wed, 23 Aug 2023 12:06:31 -0400
On 8/23/23 11:40 AM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a telecon on Wednesday, August 23rd, at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20230823T193000&p1=1440&p2=tz_pt&p3=tz_mt&p4=tz_ct&p5=tz_et&p6=tz_cest>).
>
> That is *today*, *in about 4 hours*. Obviously, I'm still struggling
> to keep up with all the things.
>
> The agenda follows.
>
> * P2909R0: Dude, where’s my char? <https://wg21.link/p2909r0>
> * P2728R6: Unicode in the Library, Part 1: UTF Transcoding
> <https://wg21.link/p2728r6>
>
> P2909R0 is a new paper from Victor that seeks to establish portable
> behavior for formatting of objects of type char regardless of the
> implementation-defined signedness of char.
>
> P2728 was last discussed during the 2023-05-10 SG16 telecon
> <https://github.com/sg16-unicode/sg16-meetings#may-10th-2023>. I have
> continued to hear feedback that the motivation for the proposal, as
> presented in the paper, is lacking. I'd like to focus on giving Zach
> specific, concrete, and clear direction regarding how to improve the
> paper in this respect. Comments should focus on what is perceived to
> be missing and what changes would fill those gaps. Following that, I
> would like to focus on error handling. The proposal includes a
> transcoding_error_handler concept and a use_replacement_character
> class that models that concept and that provides the default error
> handling behavior. The error handler is specified to take a message
> passed as an object of type std::string_view, but does not specify the
> contents of the message. The error handler is constrained to return a
> single value of type char32_t and is not given access to the source
> text (except possibly via the message, which is always char-based).
> Is this sufficient? If not, what changes are needed?
>
Zach, with regard to error handling, I would like for the paper to
present examples of error handlers that implement each of the following:
  * The policies from Unicode PR-121
    <http://unicode.org/review/pr-121.html>:
      o Replace the entire ill-formed subsequence by a single U+FFFD.
      o Replace each maximal subpart of the ill-formed subsequence by a
        single U+FFFD.
      o Replace each code unit of the ill-formed subsequence by a single
        U+FFFD.
  * Remove each ill-formed subsequence (without substituting any
    replacement characters).
  * Replace each ill-formed subsequence with an escaped representation
    of the values of the invalid code units.
(I believe only one of these is possible with the proposal as it
currently stands.)
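For reference, here is a rough sketch of the five behaviors listed above, written as a standalone UTF-8 decoder rather than against the paper's interface. The names `policy`, `step`, `next`, and `decode` are hypothetical, the `\xHH` escape format is just one possible choice, and "entire ill-formed subsequence" is taken here to mean a run of adjacent maximal subparts. Well-formedness follows the byte ranges of Table 3-7 in the Unicode Standard.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

enum class policy { whole_run, maximal_subpart, code_unit, remove, escape };

// One scan step: bytes consumed, and either a scalar value (ok) or a
// maximal subpart of an ill-formed subsequence (!ok).
struct step { std::size_t len; bool ok; char32_t cp; };

inline step next(std::string_view s, std::size_t i) {
    auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
    unsigned char b0 = byte(i);
    if (b0 < 0x80) return {1, true, b0};
    std::size_t need = 0;                 // continuation bytes expected
    unsigned char lo = 0x80, hi = 0xBF;   // allowed range for the next byte
    char32_t cp = 0;
    if (b0 >= 0xC2 && b0 <= 0xDF) { need = 1; cp = b0 & 0x1F; }
    else if (b0 >= 0xE0 && b0 <= 0xEF) {
        need = 2; cp = b0 & 0x0F;
        if (b0 == 0xE0) lo = 0xA0;        // reject overlong forms
        if (b0 == 0xED) hi = 0x9F;        // reject surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        need = 3; cp = b0 & 0x07;
        if (b0 == 0xF0) lo = 0x90;        // reject overlong forms
        if (b0 == 0xF4) hi = 0x8F;        // reject values above U+10FFFF
    } else {
        return {1, false, 0};             // C0, C1, F5..FF, or stray trail byte
    }
    std::size_t n = 1;
    for (; n <= need; ++n) {
        if (i + n >= s.size()) return {n, false, 0};   // truncated at end
        unsigned char c = byte(i + n);
        if (c < lo || c > hi) return {n, false, 0};    // valid prefix ends here
        cp = (cp << 6) | (c & 0x3F);
        lo = 0x80; hi = 0xBF;
    }
    return {n, true, cp};
}

inline std::u32string decode(std::string_view s, policy p) {
    std::u32string out;
    bool in_run = false;  // true while inside adjacent ill-formed subparts
    for (std::size_t i = 0; i < s.size();) {
        step st = next(s, i);
        if (st.ok) {
            out.push_back(st.cp);
            in_run = false;
        } else {
            switch (p) {
            case policy::whole_run:       // one U+FFFD per ill-formed run
                if (!in_run) out.push_back(U'\uFFFD');
                break;
            case policy::maximal_subpart: // one U+FFFD per maximal subpart
                out.push_back(U'\uFFFD');
                break;
            case policy::code_unit:       // one U+FFFD per code unit
                out.append(st.len, U'\uFFFD');
                break;
            case policy::remove:          // drop the ill-formed bytes
                break;
            case policy::escape:          // emit \xHH for each code unit
                for (std::size_t k = 0; k < st.len; ++k) {
                    static constexpr char32_t hex[] = U"0123456789ABCDEF";
                    unsigned char c = static_cast<unsigned char>(s[i + k]);
                    out += U"\\x";
                    out.push_back(hex[c >> 4]);
                    out.push_back(hex[c & 0xF]);
                }
                break;
            }
            in_run = true;
        }
        i += st.len;
    }
    return out;
}

}  // namespace sketch
```

Note that only the `maximal_subpart` case produces exactly one code point per detected error; the others need to emit zero or several code points, or need to see the offending code units.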
Tom.
> Tom.
>
>
>
Received on 2023-08-23 16:06:32