Re: Agenda for the 2023-08-23 SG16 telecon

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 23 Aug 2023 12:06:31 -0400
On 8/23/23 11:40 AM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a telecon on Wednesday, August 23rd, at 19:30 UTC
> (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20230823T193000&p1=1440&p2=tz_pt&p3=tz_mt&p4=tz_ct&p5=tz_et&p6=tz_cest>).
>
> That is *today*, *in about 4 hours*. Obviously, I'm still struggling
> to keep up with all the things.
>
> The agenda follows.
>
> * P2909R0: Dude, where’s my char? <https://wg21.link/p2909r0>
> * P2728R6: Unicode in the Library, Part 1: UTF Transcoding
> <https://wg21.link/p2728r6>
>
> P2909R0 is a new paper from Victor that seeks to establish portable
> behavior for formatting of objects of type char regardless of the
> implementation-defined signedness of char.
>
> P2728 was last discussed during the 2023-05-10 SG16 telecon
> <https://github.com/sg16-unicode/sg16-meetings#may-10th-2023>. I have
> continued to hear feedback that the motivation for the proposal, as
> presented in the paper, is lacking. I'd like to focus on giving Zach
> specific, concrete, and clear direction regarding how to improve the
> paper in this respect. Comments should focus on what is perceived to
> be missing and what changes would fill those gaps. Following that, I
> would like to focus on error handling. The proposal includes a
> transcoding_error_handler concept and a use_replacement_character
> class that models that concept and that provides the default error
> handling behavior. The error handler is specified to take a message
> passed as an object of type std::string_view, but does not specify the
> contents of the message. The error handler is constrained to return a
> single value of type char32_t and is not given access to the source
> text (except possibly via the message, which is always char-based).
> Is this sufficient? If not, what changes are needed?
>
Zach, with regard to error handling, I would like for the paper to
present examples of error handlers that implement each of the following:

  * The policies from Unicode PR-121
    <http://unicode.org/review/pr-121.html>:
      o Replace the entire ill-formed subsequence by a single U+FFFD.
      o Replace each maximal subpart of the ill-formed subsequence by a
        single U+FFFD.
      o Replace each code unit of the ill-formed subsequence by a single
        U+FFFD.
  * Remove each ill-formed subsequence (without substituting any
    replacement characters).
  * Replace each ill-formed subsequence with an escaped representation
    of the values of the invalid code units.

(I believe only one of these is possible with the proposal as it
currently stands.)
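
For concreteness, here is a minimal sketch of the five policies listed above, written as a standalone UTF-8 to UTF-32 routine. It deliberately does not use the P2728 transcoding_error_handler interface; every name in it (Policy, Step, decode_one, sanitize) is hypothetical and exists only for illustration. It assumes the input is a char-based UTF-8 string and that "ill-formed subsequence" means a maximal run of consecutive undecodable code units.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical names throughout; none of this is the P2728 interface.
enum class Policy { whole_sequence, maximal_subpart, each_code_unit, remove, escape };

struct Step {
    std::size_t len;  // code units consumed; on error, the maximal subpart length
    bool ok;          // true if a well-formed sequence was decoded
    char32_t cp;      // decoded code point (meaningful only if ok)
};

// Decode one UTF-8 sequence starting at s[i], following the well-formed
// byte ranges of the Unicode Standard (Table 3-7).
Step decode_one(std::string_view s, std::size_t i) {
    assert(i < s.size());
    const auto b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) return {1, true, b0};
    std::size_t need = 0;                 // number of trailing bytes expected
    unsigned char lo = 0x80, hi = 0xBF;   // allowed range for the next byte
    char32_t cp = 0;
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        need = 1; cp = b0 & 0x1F;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
        need = 2; cp = b0 & 0x0F;
        if (b0 == 0xE0) lo = 0xA0;        // reject overlong forms
        if (b0 == 0xED) hi = 0x9F;        // reject surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        need = 3; cp = b0 & 0x07;
        if (b0 == 0xF0) lo = 0x90;        // reject overlong forms
        if (b0 == 0xF4) hi = 0x8F;        // reject values above U+10FFFF
    } else {
        return {1, false, 0};             // stray trailing byte or invalid lead
    }
    std::size_t len = 1;
    for (std::size_t k = 0; k < need; ++k) {
        if (i + len >= s.size()) return {len, false, 0};  // truncated input
        const auto c = static_cast<unsigned char>(s[i + len]);
        if (c < lo || c > hi) return {len, false, 0};     // valid prefix ends here
        cp = (cp << 6) | (c & 0x3F);
        ++len;
        lo = 0x80; hi = 0xBF;             // only the second byte is restricted
    }
    return {len, true, cp};
}

std::u32string sanitize(std::string_view s, Policy p) {
    constexpr char32_t replacement = 0xFFFD;
    const char32_t* hex = U"0123456789ABCDEF";
    std::u32string out;
    bool in_bad_run = false;  // true while inside one ill-formed subsequence
    for (std::size_t i = 0; i < s.size();) {
        const Step st = decode_one(s, i);
        if (st.ok) {
            out.push_back(st.cp);
            in_bad_run = false;
        } else {
            switch (p) {
            case Policy::whole_sequence:
                if (!in_bad_run) out.push_back(replacement);
                break;
            case Policy::maximal_subpart:
                out.push_back(replacement);
                break;
            case Policy::each_code_unit:
                out.append(st.len, replacement);
                break;
            case Policy::remove:
                break;
            case Policy::escape:
                for (std::size_t k = 0; k < st.len; ++k) {
                    const auto b = static_cast<unsigned char>(s[i + k]);
                    out += U"\\x";
                    out.push_back(hex[b >> 4]);
                    out.push_back(hex[b & 0x0F]);
                }
                break;
            }
            in_bad_run = true;
        }
        i += st.len;
    }
    return out;
}
```

Note that all policies other than maximal_subpart need something a handler returning a single char32_t cannot express: the number of code units involved, their values, knowledge of whether the previous error was adjacent, or the ability to emit zero or several code points.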

Tom.

> Tom.
>
>

Received on 2023-08-23 16:06:32