Since I have yet to publish minutes from our last meeting, here are the raw notes I took during our previous discussion of P2845R0.
- P2845R0: Formatting of std::filesystem::path
- Victor summarized the history and prior challenges:
- P1636 did not handle non-printable characters.
- P1636 specified use of the native() member of
std::filesystem::path, which might involve transcoding.
- This paper proposes a formatter that does proper transcoding
and substitutes escapes for non-printable characters and
ill-formed code units.
- Victor noticed a missing double quote in the example in section
3, Problems.
- Victor: There is an R1 revision that fixes some minor issues.
- Corentin: Would backslash path delimiters on Windows be
escaped?
- Victor: Yes, that is what quoted() does.
- Victor: That is kind of weird, but consistent.
- Victor: We could provide a specifier to choose alternate
behavior.
- Corentin: What about "{:?}"?
- Victor: The escaped format is proposed as the default
behavior.
- Charlie: For escaping, we need some latitude to choose an
alternate escape character since backslash in paths has an
important meaning.
- Charlie: That would make things inconsistent between platforms
and use of a different escape would be weird.
- PBrett: How about a specifier to choose a different escape
character?
- Victor: That would be cumbersome; other options are possible
including transformation.
- Victor: For the user, I think it is good to have an escaped
and non-escaped variant.
- Tom: Use cases: text, preserved, punycode, use in shell
scripts, etc. Most transformations should be done outside of
formatting.
- Tom: Do we want more than one format?
- Corentin: I think the default behavior should just escape
Unicode stuff, use the debug formatting for other stuff.
- Victor: Quoting is useful, but not always needed.
- Tom: A specifier could be added to quote if needed.
- PBrett: There are several applications:
- I need to exactly preserve the filename; serialize the code
units.
- I need the user to be able to read the filename in an
intelligent way.
- PBrett: I don't think the paper is clearly defining the
problem it wants to solve.
- PBrett: In glib, you can clearly get the display name as valid
UTF-8, or you can get a byte array.
- Victor: The goal is to fix the issues in the previous P1636
paper.
- Victor: We can address additional use cases if needed.
- Zach: Python does what this paper is proposing; Windows path
separators are doubled.
- Zach: In Python, if you want it printed unformatted, you print
it as a string and we can do likewise.
- Zach: I think some kind of escaping is needed and quoting
works for that.
- See Corentin's email with the CE link that demonstrates Python
behavior.
- Jens: Due to the quirks in std::filesystem::path, I think this
paper should cover the motivation and design space and not just
fix issues with P1636.
- Jens: The paper should discuss, for example, the implication
of backslashes added to paths as part of escaping.
- PBrett: Agreed, the paper should expand on these details.
- PBrett: We haven't discussed encoding issues yet; this will
need another review.
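For reference, the quoted() behavior discussed above can already be observed with the existing stream inserter for std::filesystem::path, which the standard specifies in terms of std::quoted. The following is only a minimal sketch of that existing behavior, not the formatter proposed in P2845; the helper names stream_path and round_trip are made up for illustration.

```cpp
#include <filesystem>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: write a path with operator<<, which applies
// std::quoted to the path's string form. Both the delimiter (") and
// the escape character (\) are prefixed with a backslash.
std::string stream_path(const std::filesystem::path& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Hypothetical helper: write a path and read it back with operator>>,
// which undoes the quoting and escaping applied by operator<<.
std::filesystem::path round_trip(const std::filesystem::path& p) {
    std::stringstream buf;
    buf << p;
    std::filesystem::path q;
    buf >> q;
    return q;
}
```

With a path constructed from the characters C:\Users\me, stream_path returns "C:\\Users\\me" (quotes included, each backslash doubled), and round_trip returns a path equal to the original. Calling p.string() instead gives the unquoted, unescaped text.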
Tom.
SG16 will hold a telecon on Wednesday, July 12th, at 19:30 UTC (timezone conversion).
The agenda follows.
- P1030R5: std::filesystem::path_view
- Discuss what to do in lieu of overloads with std::locale parameters.
- P2845R0: Formatting of std::filesystem::path
- Continue review.
- LWG 3944: Formatters converting sequences of char to sequences of wchar_t
- Initial review.
SG16 has discussed P1030 on several occasions including the 2018-05-30 telecon, in Cologne, and in Belfast. Niall recently requested additional SG16 review after LEWG, during its review of P1030R5 in Varna, requested removal of the overloads with a std::locale parameter that were added in P1030R4 (which SG16 has not reviewed). The minutes are not perfectly clear, but it appears that at least one of the concerns is the presence of std::locale parameters on functions declared with constexpr, which doesn't make a lot of sense. I'm sure Niall, Victor, or Zach can provide more details; they were all in attendance. Our discussion goal will be to determine whether there is strong motivation to argue in favor of preserving support for std::locale (with any necessary ancillary changes), whether we agree with LEWG, or whether there is an alternative solution we would prefer to promote.
We discussed P2845R0 at our 2023-06-07 telecon (for which I have yet to publish a meeting summary; I'm sorry; I'm working on it; that link will work eventually). We did not finish that discussion or poll any aspects of the paper, so we'll continue our review at this meeting.
If time permits, we'll start discussing LWG 3944. The issue concerns handling of ranges of type char when formatting for wchar_t.
Tom.