Re: [SG16] Bike shedding for Christmas: P1885 Naming Text Encodings

From: Thiago Macieira <thiago_at_[hidden]>
Date: Sun, 05 Jan 2020 08:40:03 -0300
On Saturday, 4 January 2020 19:55:21 -03 Corentin Jabot wrote:
> On Sat, 4 Jan 2020 at 22:20, Thiago Macieira <thiago_at_[hidden]> wrote:
> > enum class mib : uint32_t {
> >     // names match
> >     // https://www.iana.org/assignments/ianacharset-mib/ianacharset-mib
> >     other = 1,
> >     unknown = 2,
> >     csASCII = 3,
> >     csISOLatin1 = 4,
> >     csUTF8 = 106,
> >     csUTF16BE = 1013,
> >     csUTF16LE = 1014,
> >     csUTF16 = 1015,
> >     csUTF32BE = 1017,
> >     csUTF32LE = 1018,
> >     csUTF32 = 1019
> > };
> >
> > However, a more powerful way for comparison would be to have a
> > text_encoding_id class that can compare to mib and to itself. It would be
> > able to tell whether two unlisted (and possibly unregistered) encodings are
> > the same, whereas mib can possibly fail. This text_encoding_id class can
> > have a mib() accessor that returns a mib number, but may return mib::other.
> > Hence, direct mib comparison should be discouraged in favour of
> > text_encoding_id.
>
> This is *exactly* what is proposed.
> They compare equal if:
> * they have the same mib
> * they have the other mib and their names compare equal (under the
> comparison algorithm ignoring case, dashes, and a few other things)
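
A minimal sketch of the rule quoted above, read literally. The encoding
struct, normalize() and same_encoding() are hypothetical names used only for
illustration, the enum is abbreviated, and the normalisation (drop every
non-alphanumeric character, then lower-case) is an assumption standing in for
"ignoring case, dashes, and a few other things":

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <string_view>

// Abbreviated copy of the enum from earlier in the thread.
enum class mib : std::uint32_t { other = 1, unknown = 2, csASCII = 3, csUTF8 = 106 };

// Hypothetical loose normalisation: keep alphanumerics only, lower-cased.
// The exact algorithm is not given in this message; this is an assumption.
std::string normalize(std::string_view name) {
    std::string out;
    for (unsigned char c : name) {
        if (std::isalnum(c)) out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Hypothetical value type: a mib number plus the name it was created from.
struct encoding {
    mib id;
    std::string name;
};

// Equal if the mibs match; when both are mib::other, the names must also
// compare equal under the loose normalisation.
bool same_encoding(const encoding& a, const encoding& b) {
    if (a.id != b.id) return false;
    if (a.id != mib::other) return true;
    return normalize(a.name) == normalize(b.name);
}
```

With this reading, {mib::other, "WTF-8"} and {mib::other, "wtf8"} compare
equal, while {mib::other, "WTF-8"} and {mib::other, "SJIS"} do not.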

Maybe with the same effect, but my specification would be that the
text_encoding_id has an internal representation that is looked up or
calculated on creation and that's what's compared, not the MIB or text name.
That implies that an implementation is not required to compare equal any ID it
doesn't know about. The only required ones are the names and aliases as
currently defined by IANA of the mandatory character sets as listed above.

That has the side-effect that when
        cs1.mib() == mib::unknown
        cs2.mib() == mib::unknown
then cs1 == cs2, regardless of how cs1 and cs2 were created. That means on
some implementations, text_encoding_id("WTF-8") == text_encoding_id("SJIS").
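
A minimal sketch of that behaviour, assuming an implementation that only
knows a couple of the mandatory character sets. The lookup table is
abbreviated, the internal representation is simply the mib value, and the
accessor is called to_mib() instead of mib() only to avoid clashing with the
enum name in this sketch; all of these are illustrative choices, not
proposed wording:

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <string_view>

// Abbreviated copy of the enum from earlier in the thread.
enum class mib : std::uint32_t { other = 1, unknown = 2, csASCII = 3, csUTF8 = 106 };

class text_encoding_id {
public:
    // The internal representation is looked up once, on creation.
    explicit text_encoding_id(std::string_view name) : rep_(lookup(name)) {}

    mib to_mib() const { return rep_; }

    // Only the internal representation is compared, never the original name.
    friend bool operator==(const text_encoding_id& a, const text_encoding_id& b) {
        return a.rep_ == b.rep_;
    }
    friend bool operator!=(const text_encoding_id& a, const text_encoding_id& b) {
        return !(a == b);
    }

private:
    // Hypothetical lookup: normalise loosely, then match a tiny table of
    // names and aliases. Anything not in the table becomes mib::unknown.
    static mib lookup(std::string_view name) {
        std::string key;
        for (unsigned char c : name) {
            if (std::isalnum(c)) key.push_back(static_cast<char>(std::tolower(c)));
        }
        if (key == "utf8" || key == "csutf8") return mib::csUTF8;
        if (key == "usascii" || key == "csascii") return mib::csASCII;
        return mib::unknown;
    }

    mib rep_;
};
```

Here text_encoding_id("UTF-8") == text_encoding_id("utf8") holds because both
map to mib::csUTF8, and text_encoding_id("WTF-8") == text_encoding_id("SJIS")
also holds because this implementation knows neither name and both collapse
to mib::unknown.
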
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel System Software Products

Received on 2020-01-05 05:42:35