sg16: Re: [SG16-Unicode] Do we really need basic_text

From: Tom Honermann <tom_at_[hidden]>
Date: Sun, 5 Aug 2018 15:42:03 -0400

Adding the SG16 mailing list back in...

On 08/04/2018 08:45 PM, Lyberta wrote:
> Tom Honermann:
>> I don't think redefining I/O in terms of std::byte would help solve text
>> related problems. For console based programs, stdin and stdout will
>> continue to have an associated encoding that is necessarily determined
>> (for interoperability purposes) by the environment the program is
>> running in. We could, of course, design an I/O library that implicitly
>> transcodes from the externally determined encoding to a program
>> determined internal encoding. Whether that would be a good thing to do
>> or not is not something I've developed strong opinions about yet. There
>> are significant challenges here since native I/O on most platforms uses
>> the execution character encoding, but Windows' native I/O uses the wide
>> execution character encoding (narrow interfaces implicitly transcode, in
>> ways that don't always work as expected). Bridging these differences
>> may require defining a "native" or "system" encoding that is used for
>> stdin, stdout, environment variables, command line options, etc...
>> Separate encodings may be necessary for file names and text file
>> contents since those may differ from other I/O.
> We only really need functions to convert from ECS and WECS to Unicode.
> The rest is trivial and can be hidden from the user.

I agree we need those transcoding functions. The outstanding question is how to provide them. One option is additional codecvt specializations, but some of the specializations we would want for this are currently taken for UTF-8 conversions (the ones proposed for deprecation in P0482), so we would have to deal with that conflict somehow. The other option is to introduce new transcoding functions that are less awkward to use.
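To make the "less awkward" option a bit more concrete, here is a minimal sketch of an ECS-to-Unicode function built only on what the library already has (std::mbrtoc32 from <cuchar>). The name narrow_to_u32 and its error policy are hypothetical, not something from a proposal. It assumes the narrow encoding is whatever the current C locale says it is, and that char32_t values are UTF-32.

```cpp
#include <cstddef>
#include <cuchar>
#include <cwchar>
#include <string>
#include <string_view>

// Hypothetical helper (not a proposed interface): decode a narrow string in
// the current C locale's multibyte encoding into UTF-32 code points.
std::u32string narrow_to_u32(std::string_view in) {
    std::u32string out;
    std::mbstate_t state{};              // initial conversion state
    const char* p = in.data();
    std::size_t left = in.size();

    while (left > 0) {
        char32_t c = 0;
        const std::size_t n = std::mbrtoc32(&c, p, left, &state);

        if (n == static_cast<std::size_t>(-1) ||
            n == static_cast<std::size_t>(-2)) {
            // Invalid or truncated sequence. Assumed policy for this sketch:
            // emit U+FFFD and stop rather than trying to resynchronize.
            out.push_back(U'\uFFFD');
            break;
        }
        if (n == static_cast<std::size_t>(-3)) {
            // A code unit from a previous call was produced; no input consumed.
            out.push_back(c);
            continue;
        }

        out.push_back(c);
        // A return of 0 means a null character was decoded. This sketch
        // assumes it occupied one byte, which holds for common encodings.
        const std::size_t used = (n == 0) ? 1 : n;
        p += used;
        left -= used;
    }
    return out;
}
```

A caller would write something like narrow_to_u32(argv[1]) after setting the locale with std::setlocale(LC_ALL, ""). Note that this covers only the ECS side and depends on global locale state; the WECS side and the question of which encoding stdin, file names, and so on actually use are untouched.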

If you feel the rest is trivial, I urge you to write a proposal for what you have in mind.

Tom.

Received on 2018-08-05 21:42:07