ISOCPP sg16 List: Re: [isocpp-sg16] Agenda for the 2026-02-11 SG16 meeting

From: Jan Schultke <janschultke_at_[hidden]>
Date: Sun, 22 Feb 2026 18:33:08 +0100

Some of the concerns that have arisen so far are being addressed in
https://isocpp.org/files/papers/P3876R1.html, which is going to be in
Monday's mailing.

On Sat, 21 Feb 2026 at 20:48, Tom Honermann <tom_at_[hidden]> wrote:

> My rough notes for this meeting are available on the (✨NEW!✨) WG21 wiki
> here <https://wiki.isocpp.org/2026_Telecons:SG16Teleconference2026-02-11>.
>
> The P3876 GH issue <https://github.com/cplusplus/papers/issues/2549> has
> been updated to indicate that SG16 review has not concluded and that we are
> now awaiting a revision with updated wording.
>
> As promised, Corentin initiated a thread on the SG16 mailing list
> <https://lists.isocpp.org/sg16/2026/02/4670.php> concerning use of the
> "literal encoding" in standard library wording where the "encoding of the C
> locale" might be more appropriate. Please review and offer any comments
> there.
>
> Jan filed LWG 4522 (Clarify that std::format transcodes for std::wformat
> strings) <https://cplusplus.github.io/LWG/issue4522> to follow up on the
> observation that numeric formatting specified in terms of to_chars()
> requires transcoding when formatting for wide-character strings.
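
To make the LWG 4522 observation concrete, here is a minimal sketch of the extra step that a to_chars()-based specification implies for wide output: std::to_chars() only writes char, so the digits have to be converted before they can be appended to a wide string. The helper name to_wide_digits is hypothetical, and the plain static_cast widening is an assumption that only holds when the wide execution encoding agrees with the narrow one for digits and '-'. It is not meant to describe what any implementation actually does.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: format an int with std::to_chars (char only),
// then widen each resulting character to wchar_t.
inline std::wstring to_wide_digits(int value) {
    char buf[16];  // enough for any 32-bit int, including the sign
    const std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    assert(r.ec == std::errc{});

    std::wstring out;
    for (const char* p = buf; p != r.ptr; ++p) {
        // Assumption: digits and '-' have the same code unit value in the
        // narrow and wide encodings (true for ASCII-compatible encodings).
        out += static_cast<wchar_t>(*p);
    }
    return out;
}
```
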
>
> Tom.
> On 2/10/26 10:02 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting *tomorrow*, Wednesday, February 11th, at 19:30
> UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20260211T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>
> ).
>
> The agenda is:
>
> - P3876R0: Extending <charconv> support to more character types
> <https://wg21.link/p3876r0>.
> - P3904R0: When paths go WTF: making formatting lossless
> <https://wg21.link/p3904r0>.
>
> This is the same agenda as our last meeting on 2026-01-28
> <https://wiki.isocpp.org/2026_Telecons:SG16Teleconference2026-01-28>. As
> you may recall, we spent the entire meeting discussing *P3876R0*, but did
> not quite conclude the discussion. We took six polls, all of which
> confirmed the direction of the paper. In this meeting, we'll review the
> wording and if all goes well, poll to forward.
>
> *P3904R0* seeks to preserve the values of code units that are not part of
> a well-formed Unicode code unit sequence (e.g., a lone surrogate) when
> formatting std::filesystem::path objects for the ordinary literal
> encoding when that encoding is UTF-8. The idea is, given an ill-formed code
> unit sequence (e.g., L'\xD800'), rather than encoding a U+FFFD
> replacement character, to encode the code unit value in WTF-8
> <https://wtf-8.codeberg.page/>, an extension of UTF-8 that encodes lone
> surrogate code points as if they were valid Unicode scalar values. This
> transformation has the downside of producing text that is not well-formed
> UTF-8 (substituting a replacement character ensures well-formed UTF-8), but
> has the upside of preserving invalid code unit sequences in a way that
> allows the original path to be recovered. Note that common filesystems that
> use 16-bit code units, such as those on Windows, do not require filesystem paths
> to be well-formed UTF-16. Also note that std::format() and std::print() support
> use of the "?" formatting option to produce a value-preserving rendering of
> ill-formed code unit sequences; given a lone surrogate such as U+D800, use
> of that option would produce "\u{d800}" instead of a replacement
> character. We therefore have a way to do round-trip preservation of
> filesystem paths today (but not via WTF-8; at least not without an
> additional explicit translation step).
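
As a rough illustration of the trade-off described above, the sketch below transcodes UTF-16 code units to bytes under two policies: substituting U+FFFD for a lone surrogate, or preserving it with the three-byte generalized UTF-8 form that WTF-8 uses. The names append_generalized_utf8, lone_surrogate_policy, and transcode_utf16 are hypothetical and are not taken from P3904R0; this is not the paper's wording or any library's implementation. Well-formed surrogate pairs are always combined into a single four-byte sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Append the UTF-8-style encoding of cp. Surrogate code points are not
// rejected here, which is what distinguishes WTF-8 from strict UTF-8.
inline void append_generalized_utf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

enum class lone_surrogate_policy { replace, preserve_wtf8 };

// Hypothetical transcoder from (possibly ill-formed) UTF-16 to bytes.
inline std::string transcode_utf16(std::u16string_view in,
                                   lone_surrogate_policy policy) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::uint32_t cu = in[i];
        const bool is_high = cu >= 0xD800 && cu <= 0xDBFF;
        const bool is_low = cu >= 0xDC00 && cu <= 0xDFFF;
        const bool next_is_low = i + 1 < in.size() &&
                                 in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            // Well-formed pair: decode to a supplementary code point.
            const std::uint32_t lo = in[i + 1];
            append_generalized_utf8(
                out, 0x10000 + ((cu - 0xD800) << 10) + (lo - 0xDC00));
            ++i;
        } else if ((is_high || is_low) &&
                   policy == lone_surrogate_policy::replace) {
            // Lossy: the original code unit value cannot be recovered.
            append_generalized_utf8(out, 0xFFFD);
        } else {
            // BMP scalar value, or a lone surrogate preserved as WTF-8.
            append_generalized_utf8(out, cu);
        }
    }
    return out;
}
```

With the preserve policy, the lone surrogate 0xD800 comes out as the bytes ED A0 80, which a strict UTF-8 decoder rejects but from which the original code unit can be recovered. With the replace policy it becomes EF BF BD (U+FFFD), which is well-formed UTF-8 but loses the original value.
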
>
> Tom.
>
>

Received on 2026-02-22 17:33:23