Re: [isocpp-sg16] Agenda for the 2026-01-28 SG16 meeting

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 28 Jan 2026 14:25:23 -0500
This meeting is starting in 5 minutes.

Tom.

On 1/27/26 4:38 PM, Tom Honermann via SG16 wrote:
>
> SG16 will hold a meeting *tomorrow*, Wednesday, January 28th, at 19:30
> UTC (timezone conversion
> <https://www.timeanddate.com/worldclock/converter.html?iso=20260128T193000&p1=1440&p2=tz_pst&p3=tz_mst&p4=tz_cst&p5=tz_est&p6=tz_cet>).
>
> The agenda is:
>
> * P3876R0: Extending <charconv> support to more character types
> <https://wg21.link/p3876r0>.
> * P3904R0: When paths go WTF: making formatting lossless
> <https://wg21.link/p3904r0>.
>
> We briefly started discussing *P3876R0* during the 2026-01-14 SG16
> meeting. I haven't published notes for that meeting because I lacked
> edit access to the new 2026 telecons wiki
> <https://wiki.isocpp.org/2026_Telecons>. I seem to have access now, so
> I'll try to get those notes published today. There wasn't much time
> for discussion, so there isn't much to summarize beyond Jan's
> presentation of the paper.
>
> *P3904R0* seeks to preserve the values of code units that are not part
> of a well-formed Unicode code unit sequence (e.g., a lone surrogate)
> when formatting std::filesystem::path objects for the ordinary literal
> encoding when that encoding is UTF-8. The idea is, given an ill-formed
> code unit sequence (e.g., L'\xD800'), rather than encoding a U+FFFD
> replacement character, to encode the code unit value in WTF-8
> <https://wtf-8.codeberg.page/>, an extension of UTF-8 that encodes
> lone surrogate code points as if they were valid Unicode scalar
> values. This transformation has the downside of producing text that is
> not well-formed UTF-8 (substituting a replacement character ensures
> well-formed UTF-8), but has the upside of preserving invalid code unit
> sequences in a way that allows the original path to be recovered. Note
> that common filesystems that use 16-bit code units, such as those on
> Windows, do not require filesystem paths to be well-formed UTF-16.
> Also note that std::format() and std::print() support use of the "?"
> formatting option to produce a value-preserving rendering of
> ill-formed code unit sequences; given a lone surrogate such as U+D800,
> use of that option would produce "\u{d800}" instead of a replacement
> character today. We therefore have a way to do round-trip preservation
> of filesystem paths today (but not via WTF-8; at least not without an
> additional explicit translation step).
>
> Tom.
>
>
>
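
For readers unfamiliar with WTF-8, the difference between the two behaviors described in the quoted message can be sketched roughly as follows. This is an illustrative sketch only, not code from P3904R0: the names append_generalized_utf8, ill_formed_policy, and transcode_utf16 are hypothetical, and the input is assumed to be a sequence of 16-bit code units such as a Windows path would hold.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: append a code point using the UTF-8 bit layout.
// Surrogate code points are NOT rejected here; that is what allows the
// result to be WTF-8 rather than strictly well-formed UTF-8.
inline void append_generalized_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical policy switch: what to do with a lone surrogate.
enum class ill_formed_policy { replace, wtf8 };

// Transcode 16-bit code units to an 8-bit encoding.
// Well-formed surrogate pairs always become one 4-byte sequence.
// A lone surrogate becomes U+FFFD (replace) or is kept as a 3-byte
// sequence carrying its original value (wtf8).
inline std::string transcode_utf16(std::u16string_view in,
                                   ill_formed_policy policy) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char32_t cu = in[i];
        const bool is_high = cu >= 0xD800 && cu <= 0xDBFF;
        const bool is_low = cu >= 0xDC00 && cu <= 0xDFFF;
        if (is_high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            const char32_t lo = in[i + 1];
            append_generalized_utf8(
                out, 0x10000 + ((cu - 0xD800) << 10) + (lo - 0xDC00));
            ++i;  // consumed the low surrogate as well
        } else if ((is_high || is_low) &&
                   policy == ill_formed_policy::replace) {
            append_generalized_utf8(out, 0xFFFD);  // lossy substitution
        } else {
            append_generalized_utf8(out, cu);  // value-preserving
        }
    }
    return out;
}
```

Under the replace policy, a lone 0xD800 code unit becomes the bytes EF BF BD (U+FFFD) and its original value is lost. Under the wtf8 policy, it becomes ED A0 80, which is not well-formed UTF-8 but can be decoded back to 0xD800.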

Received on 2026-01-28 19:25:26