SG16 will hold a meeting tomorrow, Wednesday, February 11th, at 19:30 UTC (timezone conversion).

The agenda is:

This is the same agenda as our last meeting on 2026-01-28. As you may recall, we spent the entire meeting discussing P3876R0, but did not quite conclude the discussion. We took six polls, all of which confirmed the direction of the paper. In this meeting, we'll review the wording and, if all goes well, poll to forward the paper.

P3904R0 seeks to preserve the values of code units that are not part of a well-formed Unicode code unit sequence (e.g., a lone surrogate) when formatting std::filesystem::path objects for the ordinary literal encoding when that encoding is UTF-8. Given an ill-formed code unit sequence (e.g., L'\xD800'), the idea is to encode the code unit value in WTF-8 rather than substituting a U+FFFD replacement character. WTF-8 is an extension of UTF-8 that encodes lone surrogate code points as if they were valid Unicode scalar values. This transformation has the downside of producing text that is not well-formed UTF-8 (substituting a replacement character ensures well-formed UTF-8), but has the upside of preserving invalid code unit sequences in a way that allows the original path to be recovered.

Note that common filesystems that use 16-bit code units, such as those on Windows, do not require filesystem paths to be well-formed UTF-16. Also note that std::format() and std::print() support use of the "?" formatting option to produce a value-preserving rendering of ill-formed code unit sequences; given a lone surrogate such as U+D800, use of that option produces "\u{d800}" today instead of a replacement character. We therefore already have a way to do round-trip preservation of filesystem paths (but not via WTF-8; at least not without an additional explicit translation step).
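For anyone who wants to see the byte-level difference between the two behaviors, here is a minimal sketch. It is not taken from P3904R0 and is not proposed wording; the names append_generalized_utf8, lone_surrogate_policy, and transcode_utf16 are hypothetical and exist only for this illustration. It assumes 16-bit input code units and shows a lone surrogate either replaced with U+FFFD or encoded WTF-8 style.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: append the generalized UTF-8 encoding of cp (<= U+10FFFF).
// Unlike strict UTF-8, surrogate code points U+D800..U+DFFF are not rejected.
inline void append_generalized_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical policy switch for this sketch only.
enum class lone_surrogate_policy { replace, wtf8 };

// Transcode potentially ill-formed UTF-16 to bytes.
// Well-formed surrogate pairs always become one 4-byte sequence.
// Lone surrogates become U+FFFD (replace) or a 3-byte WTF-8 sequence (wtf8).
inline std::string transcode_utf16(std::u16string_view in,
                                   lone_surrogate_policy policy) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t u = in[i];
        bool high = u >= 0xD800 && u <= 0xDBFF;
        bool low = u >= 0xDC00 && u <= 0xDFFF;
        if (high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Well-formed pair: combine into a supplementary code point.
            char32_t trail = in[i + 1];
            append_generalized_utf8(
                out, 0x10000 + ((u - 0xD800) << 10) + (trail - 0xDC00));
            ++i;
        } else if ((high || low) && policy == lone_surrogate_policy::replace) {
            // Lossy: the original code unit value cannot be recovered.
            append_generalized_utf8(out, 0xFFFD);
        } else {
            // BMP scalar value, or a lone surrogate preserved WTF-8 style.
            append_generalized_utf8(out, u);
        }
    }
    return out;
}
```

With this sketch, the lone surrogate 0xD800 becomes the bytes ED A0 80 under the wtf8 policy and EF BF BD (U+FFFD) under the replace policy. Only the former allows the original code unit to be recovered, at the cost of output that is not well-formed UTF-8.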

Tom.