Re: [SG16-Unicode] P1689: Encoding of filenames for interchange

From: Niall Douglas <s_sourceforge_at_[hidden]>
Date: Thu, 5 Sep 2019 13:57:03 +0100

On 05/09/2019 12:11, Lyberta wrote:
> Do we really expect C++20 build systems to run on filesystem paths not
> representable as UTF-8? Users who do that really shoot themselves in the
> foot.

As Tom likes to say, EBCDIC.

If something being standardised sets the source encoding for another
part of the standard which consumes it, absolutely insist on UTF-8.
Indeed, the native string view object for C that I've been discussing
offline with WG14 simply imposes UTF-8 everywhere, period.

But if somebody else is choosing the source encoding, and moreover can
vary the source encoding across build tooling runs, then build tooling
must be a taker on this.

As I like to say, you can't standardise what you don't control.

P1689 ought to be a taker and support filenames which are invalid UTF-8.
I'm particularly thinking of temporary file names which build systems
often deal with, as some temporary file name generators produce names
which are not valid UTF-8. And that's totally legal on the filesystem.
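
To make "being a taker" concrete, here is a rough sketch of how a tool might treat a path as opaque bytes, check whether it happens to be well-formed UTF-8, and fall back to a lossless byte-wise escape when it is not. The names `is_valid_utf8` and `escape_path_bytes` and the percent-escape scheme are purely illustrative assumptions; nothing here is taken from P1689.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `bytes` is well-formed UTF-8. Rejects
// overlong forms, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) { ++i; continue; }  // plain ASCII byte
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte
        if (n - i < len) return false;  // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;                        // overlong
        if (len == 3 && cp < 0x800) return false;                       // overlong
        if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return false;  // out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;                 // surrogate
        i += len;
    }
    return true;
}

// Hypothetical lossless escape: printable ASCII other than '%' passes
// through unchanged; every other byte becomes %XX (uppercase hex).
std::string escape_path_bytes(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : bytes) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c >= 0x20 && c < 0x7F && c != '%') {
            out.push_back(ch);
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}
```

A consumer of such a format would reverse the escape to recover the original bytes before handing the path to the filesystem, so a temporary file with a non-UTF-8 name survives the round trip.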

Niall

Received on 2019-09-05 14:57:07