SG16 will hold a meeting tomorrow, Wednesday, January 28th, at 19:30 UTC (timezone conversion).

The agenda is:

We briefly started discussing P3876R0 during the 2026-01-14 SG16 meeting. I haven't published notes for that meeting because I lacked edit access to the new 2026 telecons wiki. I seem to have access now, so I'll try to get those notes published today. There wasn't much time for discussion, so there isn't much to summarize beyond Jan's presentation of the paper.

P3904R0 seeks to preserve the values of code units that are not part of a well-formed Unicode code unit sequence (e.g., a lone surrogate) when formatting std::filesystem::path objects for the ordinary literal encoding when that encoding is UTF-8. The idea is, given an ill-formed code unit sequence (e.g., L'\xD800'), rather than encoding a U+FFFD replacement character, to encode the code unit value in WTF-8, an extension of UTF-8 that encodes lone surrogate code points as if they were valid Unicode scalar values. This transformation has the downside of producing text that is not well-formed UTF-8 (substituting a replacement character ensures well-formed UTF-8), but has the upside of preserving invalid code unit sequences in a way that allows the original path to be recovered. Note that common filesystems that use 16-bit code units, such as those on Windows, do not require filesystem paths to be well-formed UTF-16. Also note that std::format() and std::print() support use of the "?" formatting option to produce a value-preserving rendering of ill-formed code unit sequences; given a lone surrogate such as U+D800, use of that option would produce "\u{d800}" instead of a replacement character today. We therefore have a way to do round-trip preservation of filesystem paths today (but not via WTF-8; at least not without an additional explicit translation step).
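
To make the WTF-8 idea concrete, here is a minimal sketch of converting potentially ill-formed UTF-16 to WTF-8. This is only an illustration, not the wording or implementation proposed in P3904R0; the names to_wtf8, append_generalized_utf8, and wtf8_example are hypothetical. It assumes the usual WTF-8 rule that a well-formed surrogate pair is first combined into a supplementary code point, so only unpaired surrogates end up encoded as three-byte sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not from P3904R0): append one code point using the
// generalized UTF-8 bit layout. Surrogate code points are deliberately not
// rejected; that is what distinguishes WTF-8 from strict UTF-8.
inline void append_generalized_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical conversion from (possibly ill-formed) UTF-16 to WTF-8.
inline std::string to_wtf8(std::u16string_view in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        const bool is_lead = cp >= 0xD800 && cp <= 0xDBFF;
        if (is_lead && i + 1 < in.size()) {
            const std::uint32_t trail = in[i + 1];
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                // Well-formed pair: combine into one supplementary code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (trail - 0xDC00);
                ++i;
            }
        }
        // Unpaired surrogates fall through and are encoded as-is.
        append_generalized_utf8(out, cp);
    }
    return out;
}

// Usage sketch: a lone lead surrogate survives as the three bytes ED A0 80
// instead of being replaced by U+FFFD (EF BF BD).
inline void wtf8_example() {
    const char16_t lone[] = {0xD800};
    assert(to_wtf8(std::u16string_view(lone, 1)) == "\xED\xA0\x80");
}
```

Since the three-byte form of a lone surrogate never appears in well-formed UTF-8, a decoder that accepts it can recover the original 16-bit code unit, which is the round-trip property described above.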

Tom.