
Subject: Re: Execution encoding and the execution environment on Windows systems
From: Tom Honermann (tom_at_[hidden])
Date: 2021-01-05 11:08:41

On 1/5/21 8:46 AM, Thiago Macieira via SG16 wrote:
> On Tuesday, 5 January 2021 02:53:54 -03 Tom Honermann via SG16 wrote:
>> 8. We *could* specify a new portable entry point (a new main()
>> signature or an alternative to main()) that provides UTF-8 encoded
>> command line arguments (presumably with substitution characters in
>> place of non-transcodeable content).
> Using an alternate entry point is already possible and resolves the command-
> line problem: just use _wmain() and UTF-16.

Yes, but that is Windows-specific.  What I had in mind was a new entry
point that portably provides a UTF-8 encoded command line, perhaps simply:

    int main(int argc, char8_t **argv) { ... }

Or perhaps with "raw" access to the command line in some way; perhaps
similarly to std::filesystem::path:

    namespace std {
      class argument {
      public:
        std::string_view    string()    const;
        std::wstring_view   wstring()   const;
        std::u8string_view  u8string()  const;
        std::u16string_view u16string() const;
        std::u32string_view u32string() const;
      };
    }

    int main(std::initializer_list<std::argument> args) { ... }

(Arguments notwithstanding regarding how to provide the argument range,
avoid startup overhead, handle memory allocation, whether the command
line should be accessible outside of main (e.g., __argv), whether the
arguments should be mutable, whether the argument range should be a
mutable container, which encodings to provide, what the return type of
the converters should be, etc...)

> The pipe problem is not local to
> the current process, but the *other* one, so it can't be fixed locally.
Indeed, a new entry point doesn't address this issue at all; I went on a
tangent at the end there.  The point there was that we can assume the
availability of a (possibly ill-formed) UTF-16 encoded command line on
Windows and therefore provide an automatic non-lossy conversion to UTF-8
(or WTF-8).  And since Windows is (as far as I know) the only relevant
OS that uses wide characters as its primary interface, such an entry
point could be provided for all other implementations.
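
To make the conversion step concrete, here is a minimal sketch of what an implementation might do with each UTF-16 argument before handing it to a UTF-8 entry point. The names append_utf8 and utf16_to_utf8_lossy are hypothetical, not part of any proposal or existing library. This variant substitutes U+FFFD for unpaired surrogates, so it is lossy for ill-formed input; a WTF-8 variant would instead encode the lone surrogate itself so that the original code units can be recovered.

```cpp
#include <cstddef>
#include <string>

// Append one code point to `out` using the UTF-8 encoding scheme.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert possibly ill-formed UTF-16 to UTF-8,
// replacing each unpaired surrogate with U+FFFD (a lossy substitution).
inline std::string utf16_to_utf8_lossy(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: valid only if a low surrogate follows.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // consume the low surrogate as well
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}
```

On Windows, an implementation could presumably feed this kind of routine from the wide command line it already has; on other platforms the arguments would simply be passed through (or validated) as bytes.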

