Re: [SG16-Unicode] SG16 approval for LEWG to review std::filesystem::path_view

From: JeanHeyd Meneide <phdofthehouse_at_[hidden]>
Date: Wed, 3 Jul 2019 14:17:42 -0400

Dear Niall Douglas,

On Wed, Jul 3, 2019 at 1:11 PM Niall Douglas <s_sourceforge_at_[hidden]>
wrote:

> FYI I was completely unaware that SG16 had discussed P1030. Nobody told me.
>

      Apologies, that was my failing: I think I had only briefly mentioned
it during my first Standards Meeting after I first met you in
Rapperswil, Switzerland, but I likely should have delivered the feedback
via e-mail rather than glossing over it briefly in person in the most timid of
fashions.

> I should add that the killing off of char input was strongly requested
> by Billy. I got the feeling it was a red line for him. I can understand
> why, from a MSVC-implementer perspective, and I have witnessed first
> hand the brokenness of char input to path on Windows.
>

     I am very glad that `char` is not included here. My only potential
concern is that Linux-exclusive users will cry out. But then again, so will
MSVC users with L"" strings. "Everyone suffers equally" is cold comfort,
but it's completely understandable why.

     If my Unicode work is anywhere close to mildly successful, it might
make it possible to specify char and wchar_t overloads as conversions. But
that can be added at a much later date with no breakage to the proposal as
it is, which is great! Yay!

> The aim is for path_view to be usually no worse than path, nothing more.
> If the input is in UTF-8, and the system API requires UTF-16, then you
> need to convert, same as for path. Unless you want to push mandatory
> #ifdef-ing onto the end user, which I don't think we want.
>

     Right, I think the confusion in Tom's reading of the original example
was the impression that it _always_ converted. That's not the case for
path_view: it will only convert when the passed-in encoding does not match
the native file system's encoding. If the user passes in UTF-8 on POSIX, no
conversion will be done.
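
     To make the "convert only when necessary" point concrete, here is a
minimal sketch. It is not P1030's interface: `encoding`, `native_path`,
`prepare`, and `utf8_to_utf16` are all names I made up for illustration, the
input is assumed to be well-formed UTF-8 held in a std::string_view, and the
transcoder does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical tag for the encoding the system API wants (not from P1030).
enum class encoding { utf8, utf16 };

// Hypothetical result type: either a borrowed view of the caller's bytes
// (no conversion happened) or an owned, transcoded buffer.
struct native_path {
    bool converted = false;     // true only if a transcoding step ran
    std::string_view borrowed;  // meaningful when converted == false
    std::u16string owned;       // meaningful when converted == true
};

// Minimal UTF-8 to UTF-16 transcoder. Assumes well-formed input; a real
// implementation would have to validate and report errors.
inline std::u16string utf8_to_utf16(std::string_view in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (b0 < 0x80)      { cp = b0;        len = 1; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; len = 2; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; len = 3; }
        else                { cp = b0 & 0x07; len = 4; }
        // Fold in the continuation bytes, six payload bits each.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Convert only when the input encoding (UTF-8 here) differs from the
// native one; otherwise hand the caller's bytes through untouched.
inline native_path prepare(std::string_view utf8_input, encoding native) {
    native_path result;
    if (native == encoding::utf8) {
        result.borrowed = utf8_input;  // pass-through, no allocation
    } else {
        result.converted = true;
        result.owned = utf8_to_utf16(utf8_input);
    }
    return result;
}
```

     With a UTF-8 native encoding (the POSIX case) `prepare` returns a view
of the very same bytes; only the UTF-16 case allocates and transcodes.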

     Finally, I have one more... thing? It's not really a concern or a nit
with the paper, just a bit of sadness: had we standardized a c_string_view
of some form, we wouldn't need what will probably amount to "*Expects:*
ptr[size] == 0 is true" on all the specifications on the constructors for the string
view / charX + pointer overloads. I agree with the reasoning in the paper
that there is rarely a case where users expect that to be the case, but
it's not exactly impossible: I have received a lovely crop of bug reports
from people seeing "std::string_view" overloads in my libraries and passing
non-null-terminated strings into them, because that's what string_view
promises. Whether or not your wording has an "expects" clause, it's not
hard to imagine substring or other similar pointer + size
manipulations producing hell here.
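
     A small sketch of the hazard, for illustration only. The helpers
`satisfies_expects` and `make_terminated` are hypothetical names of mine, not
anything from the paper, and the check is only safe because the view is known
to lie inside a std::string whose terminator we can legally read.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical check of the would-be precondition ptr[size] == 0.
// `v` must lie inside `backing`, so that reading v.data()[v.size()] stays
// in bounds (std::string guarantees a terminator at data()[size()]).
inline bool satisfies_expects(const std::string& backing, std::string_view v) {
    const char* begin = backing.data();
    const char* end = begin + backing.size();  // *end is the terminator
    assert(v.data() >= begin && v.data() + v.size() <= end);
    return v.data()[v.size()] == '\0';
}

// Hypothetical fix-up: copy the view into an owning string, which is
// always null-terminated, before handing it to a terminator-expecting API.
inline std::string make_terminated(std::string_view v) {
    return std::string(v);
}
```

     For a backing string "dir/file.txt", the whole view and the suffix
`substr(4)` satisfy the check, while the prefix `substr(0, 3)` does not: the
character after "dir" is '/', not a terminator. That prefix is a perfectly
valid string_view, which is exactly why such inputs show up in bug reports.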

      Then again, this is marked as the *path*_view type. Maybe that will
be enough visual indication to the user they should think carefully and not
just toss in random substrings. I certainly hope it is.

Sincerely,
JeanHeyd Meneide

Received on 2019-07-03 20:17:54