sg16: Re: [SG16] Draft proposal: Clarify guidance for use of a BOM as a UTF-8 encoding signature

From: Jens Maurer <Jens.Maurer_at_[hidden]>
Date: Fri, 16 Oct 2020 20:58:16 +0200

On 16/10/2020 17.44, Thiago Macieira via SG16 wrote:
> On Tuesday, 13 October 2020 22:16:14 PDT Tom Honermann via SG16 wrote:
>>> Everyone already knows the best practice: “Use UTF-8”. Any
>>> resources/effort is going to be getting toward that best practice, not
>>> edge cases of legacy behaviors that are offshoots of something that
>>> isn’t the desired end state of “use UTF-8”.
>>
>> My goal is exactly to ease migration to that end state. We can't
>> reasonably synchronize a migration of all C++ projects to UTF-8. To get
>> to that end state, we'll have to enable C++ projects to independently
>> transition to UTF-8.
>
> I think we can. We just need critical mass.
>
> The status quo has remained because there has been nothing forcing a change to
> the status quo. Yes, there are a lot of old codebases that, for example, might have
> comments written in Chinese or Finnish or something else. But nothing has
> forced those to update. If the critical mass of software is UTF-8, that will
> force those codebases to recode. And unlike Microsoft's fixing of their own
> SDK header files to comply with the language, this is a simple recode
> operation. It can be done by downstream users, with little to no danger.

Such a recode might be easier for some and harder for others,
depending on which older versions of compilers need to be
supported or other environmental factors, possibly beyond the
immediate control of the developer or project.

I don't believe we have sufficient insight into C++ code at large
in WG21, let alone SG16, so let's be careful with statements assuming
that people will be "forced" to do something.

Jens

Received on 2020-10-16 13:58:43