[SG16] Feedback re: P1885R5: Naming Text Encodings

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Mon, 26 Jul 2021 10:59:52 -0400

Apologies for sending this so close to the start of the telecon.

The paper refers to RFC 3808 which established the "IANA Charset MIB" and
specifically says:

> However, [rfc3808] is from 2004 and has not been updated.
>

The paper should at least make some reference to the location where the
"IANA Charset MIB" is maintained:
https://www.iana.org/assignments/ianacharset-mib/ianacharset-mib. As of the
document date indicated in P1885R5, the MIB module was last updated in
January of the current year (2021). It seems this location was mentioned in
passing on the SG16 reflector, but with no emphasis on its significance.

Also, I find the choice of naming the accessor that produces the MIBenum
value from the IANA Charset Registry "mib" to be unfortunate. It seems
there are unambiguous precedent cases in libraries (including those for
other programming languages) using "mib" to refer to MIBenum values, so
this does not rise to the level of an objection. I see no reason for the
paper to use the term this way in prose, though. I note that RFCs do seem
to use the term to refer not only to MIBs but also to MIB modules (but a
MIBenum value is neither a MIB nor a MIB module).

Regarding the naming of the enumerator values, I am not fond of excess
"invention" here. There are names (beginning with "cs") in the reference
documents. Using those names (including the "cs" prefix) makes even the
"csUnicode" case "merely following established practice".
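
To illustrate, here is a minimal sketch of what enumerators that reuse the
"cs" names could look like. The enumeration name charset_id and the helper
mibenum() are hypothetical and not taken from the paper; the enumerator names
and values shown are the ones I understand the IANA Charset MIB to assign.

```cpp
#include <cassert>

// Hypothetical enumeration (not the paper's wording). The enumerator names
// and values follow the IANA Charset MIB, including the "cs" prefix.
enum class charset_id : int {
    csASCII = 3,
    csISOLatin1 = 4,
    csUTF8 = 106,
    csUnicode = 1000,  // registered for ISO-10646-UCS-2
    csUTF16 = 1015,
    csUTF32 = 1017,
};

// Hypothetical helper: returns the MIBenum value behind an enumerator.
constexpr int mibenum(charset_id id) { return static_cast<int>(id); }

// Small self-check that the enumerators carry the registry values.
inline void check_known_values() {
    assert(mibenum(charset_id::csUTF8) == 106);
    assert(mibenum(charset_id::csUnicode) == 1000);
}
```

With this spelling, a reader can search the MIB module for an enumerator name
and find the matching entry directly.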

Regarding the underlying type of the enumeration: The corresponding
definition in RFC 3808 uses ASN.1 INTEGER (which does not have a length
limit).
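
A rough sketch of the consequence, under the assumption that the enumeration
is given a fixed-width underlying type: a value arriving as an unbounded
integer needs a range check before it can be converted. The names mib_value
and to_mib_value are hypothetical, and std::int32_t is only an example
choice, not something the paper specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <type_traits>

// Hypothetical enumeration with a fixed underlying type (an assumption made
// for this sketch). Every value of the underlying type is a valid value.
enum class mib_value : std::int32_t {};

// Hypothetical helper: converts a wide integer to mib_value, or returns
// std::nullopt when the value does not fit in the underlying type.
constexpr std::optional<mib_value> to_mib_value(std::intmax_t wide) {
    using underlying = std::underlying_type_t<mib_value>;
    if (wide < std::numeric_limits<underlying>::min() ||
        wide > std::numeric_limits<underlying>::max()) {
        return std::nullopt;
    }
    return static_cast<mib_value>(static_cast<underlying>(wide));
}

static_assert(to_mib_value(106).has_value());

// Runtime self-check: a value wider than 32 bits is rejected.
inline void check_range_handling() {
    assert(!to_mib_value(std::intmax_t{1} << 40).has_value());
}
```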

Regarding the "environment" functions, I think the wording needs more work
to address cases where the value of the LANG environment variable is
changed.
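
The scenario can be sketched as follows, assuming a POSIX system (setenv is
POSIX, not ISO C++) and a LANG value of the form
language[_territory][.codeset][@modifier]. The helper codeset_from_lang is
hypothetical and is not the paper's interface; it only shows that a value
captured early and a value queried later can disagree.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: returns the codeset part of LANG, or an empty string
// when LANG is unset or carries no codeset.
inline std::string codeset_from_lang() {
    const char* lang = std::getenv("LANG");
    if (lang == nullptr) return {};
    const std::string value(lang);
    const auto dot = value.find('.');
    if (dot == std::string::npos) return {};
    const auto at = value.find('@', dot);
    const auto count =
        (at == std::string::npos) ? std::string::npos : at - dot - 1;
    return value.substr(dot + 1, count);
}

// Captures the codeset once, changes LANG, then queries again. Returns true
// when the captured value and the later query differ.
inline bool snapshot_and_live_query_diverge() {
    setenv("LANG", "en_US.UTF-8", 1);
    const std::string at_startup = codeset_from_lang();
    setenv("LANG", "en_US.ISO-8859-1", 1);
    const std::string later = codeset_from_lang();
    return at_startup != later;
}
```

Whether the "environment" functions are meant to report the earlier or the
later value is the kind of question the wording would need to answer.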

Received on 2021-07-26 10:00:25