On Wed, Sep 8, 2021 at 1:08 PM Corentin <corentin.jabot@gmail.com> wrote:

On Wed, Sep 8, 2021 at 6:50 PM Jens Maurer <Jens.Maurer@gmx.net> wrote:
On 08/09/2021 18.08, Hubert Tong via SG16 wrote:
> As it is, I think it is worthwhile to revisit whether the generality of the implementation-defined behaviour is advisable. It seems that, as the paper evolved, at least one implementation-injected alias was meant to be the "preferred name" on the system returned or recognized by various APIs (e.g., iconv_open). Even that is problematic: There is a tendency in converter applications to treat a de facto "reigning" extension as being what is meant when the non-extended standard is requested. In highly architected environments, the csShiftJIS and csWindows31J "problem" that is present in ICU would manifest as there being only one API-recognized "preferred name". The present design intent of P1885 in having non-overlapping sets of aliases is in conflict with the desire to associate the "preferred name" as an alias in such situations.

You seem to be saying that the preferred name for both csShiftJIS and
csWindows31J is supposed to be "Shift-JIS" (or so), but an alias is
supposed to be globally unique under P1885.

Yes, aliases and primary names are globally unique in P1885.
As such, an implementation that would return Windows31J for "Shift-JIS" would not be valid.
Note that in a few cases, it is preferable to know both the name and the platform to derive an exact transcoding table.

The implementation-defined aliases permission does not allow for duplication.
It exists in case implementers want to add data from other sources, such as POSIX locale files or whatever GetCPInfoExA would return on Windows, to the extent that doing so would not introduce duplication.
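
To make the no-duplication constraint concrete, here is a minimal sketch of a registry that refuses an alias already owned by another character set. It is not the P1885 interface: AliasRegistry, add_alias, lookup and fold are hypothetical names, the name folding is a deliberately simplified assumption (lowercase, ASCII letters and digits only), and the entries used with it are illustrative.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Simplified name folding (an assumption of this sketch): lowercase the
// name and keep only ASCII letters and digits.
std::string fold(std::string_view name) {
    std::string out;
    for (unsigned char c : name) {
        if (std::isalnum(c)) out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Hypothetical registry in which every folded alias has exactly one owner.
class AliasRegistry {
public:
    // Returns false, leaving the registry unchanged, if the alias is
    // already owned by a different primary name.
    bool add_alias(std::string_view alias, std::string_view primary) {
        assert(!primary.empty() && "a character set needs a primary name");
        auto [it, inserted] =
            owner_.try_emplace(fold(alias), std::string(primary));
        return inserted || std::string_view(it->second) == primary;
    }

    // Returns the primary name that owns the alias, if any.
    std::optional<std::string> lookup(std::string_view name) const {
        auto it = owner_.find(fold(name));
        if (it == owner_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::map<std::string, std::string> owner_;  // folded alias -> primary
};
```

With "Shift_JIS" and "csShiftJIS" registered for Shift_JIS, a later add_alias("Shift-JIS", "Windows-31J") returns false under this folding, which is the situation described above: the platform's preferred name cannot be attached to a second character set.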

I believe that the "gotcha" on the ability to add data from other sources waters down the desirability of that ability by a lot.

I think Hubert's point is that Windows users may not get the result they expect by constructing a text encoding from "Shift-JIS" if what they want is actually "Windows-31J", given that most users will refer to "Windows-31J" as "Shift JIS" colloquially. Unfortunately, I am not sure that any amount of API design can make that scenario less confusing.

Insisting that "Shift-JIS" is ambiguous and making the user disambiguate from a selection of choices is a possible direction for resolving this case; however, I believe an API design that is less ambitious could also make similar scenarios less confusing: namely, one in which the set of registered character sets and their associated properties is strictly a representation of the IANA character set registry.
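
As a rough illustration of the first direction (treating the label as ambiguous and asking the user to choose), a sketch might look like the following. Everything here is hypothetical: the Resolution enum, the candidates_for, classify and pick functions, and the table contents are assumptions made for the example, not anything proposed in P1885.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Resolution { unknown, unique, ambiguous };

// Hypothetical table: colloquial label -> registry names it might mean.
const std::vector<std::string>& candidates_for(const std::string& label) {
    static const std::map<std::string, std::vector<std::string>> table = {
        {"Shift-JIS", {"Shift_JIS", "Windows-31J"}},
        {"UTF-8", {"UTF-8"}},
    };
    static const std::vector<std::string> none;
    auto it = table.find(label);
    return it == table.end() ? none : it->second;
}

// Reports whether a label names no, exactly one, or several character sets.
Resolution classify(const std::string& label) {
    const auto n = candidates_for(label).size();
    if (n == 0) return Resolution::unknown;
    return n == 1 ? Resolution::unique : Resolution::ambiguous;
}

// The caller disambiguates explicitly by choosing one listed candidate.
const std::string& pick(const std::string& label, std::size_t choice) {
    const auto& c = candidates_for(label);
    assert(choice < c.size() && "choice must index a listed candidate");
    return c[choice];
}
```

Under these assumptions, classify("Shift-JIS") reports Resolution::ambiguous, and the caller has to pick between "Shift_JIS" and "Windows-31J" rather than having one silently substituted for the other.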