Date: Sat, 21 Feb 2026 14:48:23 -0500
My rough notes for this meeting are available on the (✨NEW!✨) WG21 wiki
here <https://wiki.isocpp.org/2026_Telecons:SG16Teleconference2026-02-11>.
The P3876 GH issue <https://github.com/cplusplus/papers/issues/2549> has
been updated to indicate that SG16 review has not concluded and that we
are now awaiting a revision with updated wording.
As promised, Corentin initiated a thread on the SG16 mailing list
<https://lists.isocpp.org/sg16/2026/02/4670.php> concerning use of the
"literal encoding" in standard library wording where the "encoding of
the C locale" might be more appropriate. Please review and offer any
comments there.
Jan filed LWG 4522 (Clarify that std::format transcodes for std::wformat
strings) <https://cplusplus.github.io/LWG/issue4522> to follow up on the
observation that numeric formatting specified in terms of to_chars()
requires transcoding when formatting for wide-character strings.
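
To illustrate the observation behind LWG 4522, here is a minimal sketch, not taken from the issue or from any implementation. std::to_chars() writes only char, so a wide-character result needs a separate conversion step. The helper name to_wide_digits is hypothetical, and the per-character mapping is just one simple way to do the conversion for integer output.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: format an int with std::to_chars (narrow output only),
// then map each narrow character to its wide-character counterpart.
std::wstring to_wide_digits(int value) {
    char buf[16];  // large enough for any 32-bit int, including the sign
    auto [ptr, ec] = std::to_chars(buf, buf + sizeof buf, value);
    assert(ec == std::errc{});

    std::wstring out;
    for (const char* p = buf; p != ptr; ++p) {
        // Base-10 integer output contains only '-' and the digits '0'..'9'.
        // Digits are contiguous in every encoding, so index into a wide table
        // instead of assuming the narrow and wide encodings share values.
        out.push_back(*p == '-' ? L'-' : L"0123456789"[*p - '0']);
    }
    return out;
}
```

Floating-point output would additionally need mappings for characters such as '.', 'e', and '+'.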
Tom.
On 2/10/26 10:02 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting *tomorrow*, Wednesday, February 11th, at
> 19:30 UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20260211T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).
>
> The agenda is:
>
> * P3876R0: Extending <charconv> support to more character types
> <https://wg21.link/p3876r0>.
> * P3904R0: When paths go WTF: making formatting lossless
> <https://wg21.link/p3904r0>.
>
> This is the same agenda as our last meeting on 2026-01-28
> <https://wiki.isocpp.org/2026_Telecons:SG16Teleconference2026-01-28>.
> As you may recall, we spent the entire meeting discussing *P3876R0*,
> but did not quite conclude the discussion. We took six polls, all of
> which confirmed the direction of the paper. In this meeting, we'll
> review the wording and, if all goes well, poll to forward.
>
> *P3904R0* seeks to preserve the values of code units that are not part
> of a well-formed Unicode code unit sequence (e.g., a lone surrogate)
> when formatting std::filesystem::path objects for the ordinary literal
> encoding when that encoding is UTF-8. The idea is, given an ill-formed
> code unit sequence (e.g., L'\xD800'), rather than encoding a U+FFFD
> replacement character, to encode the code unit value in WTF-8
> <https://wtf-8.codeberg.page/>, an extension of UTF-8 that encodes
> lone surrogate code points as if they were valid Unicode scalar
> values. This transformation has the downside of producing text that is
> not well-formed UTF-8 (substituting a replacement character ensures
> well-formed UTF-8), but has the upside of preserving invalid code unit
> sequences in a way that allows the original path to be recovered. Note
> that common filesystems that use 16-bit code units, such as on
> Windows, do not require filesystem paths to be well-formed UTF-16.
> Also note that std::format() and std::print() support use of the "?"
> formatting option to produce a value-preserving rendering of
> ill-formed code unit sequences; given a lone surrogate such as U+D800,
> use of that option would produce "\u{d800}" instead of a replacement
> character today. We therefore have a way to do round-trip preservation
> of filesystem paths today (but not via WTF-8; at least not without an
> additional explicit translation step).
>
> Tom.
>
>
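
As a rough illustration of the WTF-8 transformation described above (a sketch only, not the wording or implementation proposed in P3904R0), the following converts potentially ill-formed UTF-16 to WTF-8. The function names are hypothetical, and char16_t input is assumed so that the example does not depend on the platform's wchar_t width. Well-formed surrogate pairs are combined as in ordinary UTF-8 conversion; a lone surrogate is encoded as a three-byte sequence as if it were a scalar value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Append the "generalized UTF-8" encoding of cp; surrogate code points are
// deliberately accepted here, which is what distinguishes WTF-8 from UTF-8.
void append_generalized_utf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper: convert potentially ill-formed UTF-16 to WTF-8.
std::string utf16_to_wtf8(std::u16string_view in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        const bool is_lead = cp >= 0xD800 && cp <= 0xDBFF;
        if (is_lead && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Well-formed pair: combine into one supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        // A lone surrogate falls through unchanged and is encoded as-is.
        append_generalized_utf8(out, cp);
    }
    return out;
}
```

With this sketch, the lone surrogate 0xD800 becomes the bytes ED A0 80, which is not well-formed UTF-8 but can be decoded back to the original code unit.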
Received on 2026-02-21 19:48:29
