Re: [SG16] P1885: Naming text encodings: Curation and provenance of aliases

From: Corentin <corentin.jabot_at_[hidden]>
Date: Wed, 8 Sep 2021 19:08:07 +0200
On Wed, Sep 8, 2021 at 6:50 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:

> On 08/09/2021 18.08, Hubert Tong via SG16 wrote:
> > As it is, I think it is worthwhile to revisit whether the generality of
> > the implementation-defined behaviour is advisable. It seems that, as the
> > paper evolved, at least one implementation-injected alias was meant to be
> > the "preferred name" on the system returned or recognized by various APIs
> > (e.g., iconv_open). Even that is problematic: There is a tendency in
> > converter applications to treat a de facto "reigning" extension as being
> > what is meant when the non-extended standard is requested. In highly
> > architected environments, the csShiftJIS and csWindows31J "problem" that is
> > present in ICU would manifest as there being only one API-recognized
> > "preferred name". The present design intent of P1885 in having
> > non-overlapping sets of aliases is in conflict with the desire to associate
> > the "preferred name" as an alias in such situations.
>
> You seem to be saying that the preferred name for both csShiftJIS and
> csWindows31J is supposed to be "Shift-JIS" (or so), but an alias is
> supposed to be globally unique under P1885.

Yes, aliases and primary names are globally unique in P1885.
As such, an implementation that would return Windows31J for "Shift-JIS"
would not be valid.
Note that in a few cases, it is preferable to know both the name and the
platform to derive an exact transcoding table.

The implementation-defined aliases permission does not allow for that.
It exists in case implementers want to add data from other sources, like
POSIX locale files or whatever GetCPInfoExA would return on Windows, to the
extent that it would not introduce duplication.
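
To make the uniqueness constraint concrete, here is a rough sketch of an
alias table in which an implementation-added alias is rejected when it
would duplicate an existing one. Everything in it is hypothetical
(EncodingId, AliasRegistry, normalize); it is not the P1885 interface,
and normalize is only a simplified stand-in for the name comparison the
paper specifies.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <string_view>

// Hypothetical identifiers for illustration; not the P1885 enumeration.
enum class EncodingId { ShiftJIS, Windows31J };

// Simplified normalization (an assumption): keep only alphanumerics,
// lowercased, so "Shift_JIS", "shift-jis" and "SHIFT JIS" compare equal.
inline std::string normalize(std::string_view name) {
    std::string out;
    for (unsigned char c : name) {
        if (std::isalnum(c)) {
            out.push_back(static_cast<char>(std::tolower(c)));
        }
    }
    return out;
}

class AliasRegistry {
public:
    // Registers an alias for an encoding. Returns false and leaves the
    // table unchanged if the normalized alias is already registered,
    // whether for this encoding or for another one.
    bool add_alias(std::string_view alias, EncodingId id) {
        return aliases_.emplace(normalize(alias), id).second;
    }

    // Returns the encoding registered for the name, or nullptr if unknown.
    // Because aliases are unique, at most one encoding can match.
    const EncodingId* find(std::string_view name) const {
        auto it = aliases_.find(normalize(name));
        return it == aliases_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, EncodingId> aliases_;
};
```

With such a table, once "Shift_JIS" is registered for one encoding, an
attempt to register "shift-jis" for Windows-31J fails, which is the
situation described above.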

I think Hubert's point is that Windows users may not get the result they
expect by constructing a text encoding from "Shift-JIS" if what they
actually want is "Windows-31J", given that most users colloquially refer to
"Windows-31J" as "Shift JIS". Unfortunately, I am not sure
that any amount of API design can make that scenario less confusing.

> Jens

Received on 2021-09-08 12:08:22