Re: [SG16] Execution encoding and the execution environment on Windows systems

From: Tom Honermann <tom_at_[hidden]>
Date: Tue, 5 Jan 2021 12:08:41 -0500
On 1/5/21 8:46 AM, Thiago Macieira via SG16 wrote:
> On Tuesday, 5 January 2021 02:53:54 -03 Tom Honermann via SG16 wrote:
>> 8. We *could* specify a new portable entry point (a new main()
>> signature or an alternative to main()) that provides UTF-8 encoded
>> command line arguments (presumably with substitution characters in
>> place of non-transcodeable content).
> Using an alternate entry point is already possible and resolves the command-
> line problem: just use _wmain() and UTF-16.

Yes, but that is Windows-specific. What I had in mind was a new entry
point that portably provides a UTF-8 encoded command line, perhaps simply:

    int main(int argc, char8_t **argv) { ... }

Or with "raw" access to the command line in some way, perhaps modeled
on std::filesystem::path:

    namespace std {
      class argument {
      public:
        std::string_view string() const;
        std::wstring_view wstring() const;
        std::u8string_view u8string() const;
        std::u16string_view u16string() const;
        std::u32string_view u32string() const;
      };
    }

    int main(std::initializer_list<std::argument> args) { ... }

(Setting aside the inevitable arguments over how to provide the argument
range, how to avoid startup overhead, how to handle memory allocation,
whether the command line should be accessible outside of main (e.g.,
__argv), whether the arguments should be mutable, whether the argument
range should be a mutable container, which encodings to provide, what
the return type of the converters should be, etc...)
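
To make a couple of those open questions concrete, here is a minimal
sketch of how an argument range could be layered on top of today's
main(). Everything in it is hypothetical: the namespace `sketch`, the
`argument` class (reduced to the single `string()` accessor), and the
`make_arguments` helper are illustration only and not proposed wording.
It assumes the native narrow form is stored and returned as a view.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace sketch {  // deliberately not namespace std; nothing here is standard

// Hypothetical stand-in for the std::argument idea above.
// Only the native narrow form is stored, so string() can return a view.
class argument {
public:
    explicit argument(std::string native) : native_(std::move(native)) {}

    // View into owned storage; valid for as long as this object lives.
    std::string_view string() const noexcept { return native_; }

private:
    std::string native_;
};

// Hypothetical adapter: copy argv into an owning range of arguments.
inline std::vector<argument> make_arguments(int argc, const char* const* argv) {
    std::vector<argument> args;
    args.reserve(argc > 0 ? static_cast<std::size_t>(argc) : 0);
    for (int i = 0; i < argc; ++i) {
        args.emplace_back(std::string(argv[i]));
    }
    return args;
}

}  // namespace sketch
```

A program would call `sketch::make_arguments(argc, argv)` at the top of
main(). Note that this copies every argument, which is exactly the kind
of startup and allocation cost listed above; returning views for the
other encodings would additionally require storing (or lazily caching)
each transcoded form.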

> The pipe problem is not local to
> the current process, but the *other* one, so it can't be fixed locally.
>
Indeed, a new entry point doesn't address this issue at all; I went on a
tangent at the end there. My point was that we can assume the
availability of a (possibly ill-formed) UTF-16 encoded command line on
Windows and can therefore provide an automatic conversion to UTF-8 that
is non-lossy for well-formed input (or to WTF-8, which also preserves
unpaired surrogates). And since Windows is (as far as I know) the only
relevant OS that uses wide characters as its primary interface, such an
entry point could be provided for all other implementations.
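
For illustration, the conversion could look roughly like the following
sketch. The function name `utf16_to_wtf8` is made up for this example,
and the code assumes the input is a sequence of UTF-16 code units that
may contain unpaired surrogates. Valid surrogate pairs become ordinary
4-byte UTF-8; an unpaired surrogate is written as a 3-byte sequence,
which is what makes the result WTF-8 rather than strict UTF-8 and keeps
the conversion reversible.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: encode possibly ill-formed UTF-16 as WTF-8.
std::string utf16_to_wtf8(std::u16string_view in) {
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: at most 3 bytes per code unit

    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];

        // Combine a lead surrogate with a following trail surrogate.
        const bool lead = cp >= 0xD800 && cp <= 0xDBFF;
        if (lead && i + 1 < in.size()) {
            const char32_t next = in[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (next - 0xDC00);
                ++i;  // the trail surrogate has been consumed
            }
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            // Also reached for unpaired surrogates (the WTF-8 extension).
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

On Windows the input would presumably come from the wide command line
(for example, the arguments passed to wmain()); on other platforms no
conversion is needed because the command line is already narrow.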

Tom.


Received on 2021-01-05 11:08:44