SG16

Subject: Re: Bike shedding for Christmas: P1885 Naming Text Encodings
From: Corentin Jabot (corentinjabot_at_[hidden])
Date: 2020-01-05 06:56:29


On Sun, 5 Jan 2020 at 13:28, Thiago Macieira <thiago_at_[hidden]> wrote:

> On Sunday, 5 January 2020 09:12:35 -03 Corentin Jabot wrote:
> > The way I have done it:
> > If there is a name which is not known to the implementation, the mib
> > will be `other` rather than unknown.
> > A text_encoding with an unknown mib has no name (if two text_encodings
> > are unknown, they compare equal).
>
> Sorry, I disagree. If the implementation doesn't know this encoding, then
> by definition it's "unknown". "other" should only be used for encodings it
> knows about but which are not registered with IANA.

I am not sure I see the value in that.
It would mean the implementation needs to maintain a list of non-registered
encodings it knows about (which my implementation doesn't do).
And then we have three states: unknown, other, invalid. I am not sure
differentiating unknown and invalid is pertinent.
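
For illustration only, here is a minimal sketch of the two-state behaviour I am describing. The enum name `mib_id`, the two-entry lookup table, and the rule that two `other` ids compare by name are assumptions made for this example, not proposal wording. The numeric values are the ones from the IANA Character Sets MIB (RFC 3808).

```cpp
#include <string>
#include <string_view>

// Hypothetical sketch, not the P1885 interface.
class text_encoding_id {
public:
    // Values follow the IANA Character Sets MIB (RFC 3808); 0 is unused there.
    enum mib_id : int {
        other   = 1,    // name given, but no MIB known to this implementation
        unknown = 2,    // the encoding could not be determined at all
        ASCII   = 3,
        UTF8    = 106,
    };

    // No name supplied: mib is `unknown` and the name is empty.
    text_encoding_id() = default;

    // Names outside the (tiny, illustrative) table map to `other`
    // and keep the string they were constructed with.
    explicit text_encoding_id(std::string_view name)
        : mib_(lookup(name)), name_(name) {}

    mib_id mib() const { return mib_; }
    const std::string& name() const { return name_; }

    friend bool operator==(const text_encoding_id& a, const text_encoding_id& b) {
        if (a.mib_ != b.mib_) return false;
        // Assumption: two `other` ids are equal only if their names match.
        // Two `unknown` ids always compare equal.
        return a.mib_ != other || a.name_ == b.name_;
    }

private:
    static mib_id lookup(std::string_view name) {
        if (name == "UTF-8") return UTF8;
        if (name == "US-ASCII") return ASCII;
        return other;
    }

    mib_id mib_ = unknown;
    std::string name_;
};
```

With this shape the implementation never needs a separate list of non-registered encodings: anything it cannot map to a MIB is simply `other` plus the name it was given.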

> > If you construct a text encoding with an unregistered name, or a name
> > otherwise not known to the implementation, it will have the "other" mib,
> > i.e.:
> >
> > text_encoding_id id("Rubbish");
> > id.mib() == text_encoding_id::other;
> > id.name() == "Rubbish"
>
> I wouldn't do that. And if mib() == unknown, name() is not required to
> return the original string or anything that was deduced from it.
>
> I really do not see the point in text_encoding_id being able to handle
> encodings the implementation doesn't know about. It's never going to be
> able to encode to them or decode from them. It won't know what the official
> name should be to write a Content-Type MIME header line.
>

There is right now no relation whatsoever between my proposal and
encoding/decoding facilities.
It is _just_ a name.

>
> > My interpretation of the RFC is that the mib unknown is meant to mean
> > "we cannot tell you what the mib is" (maybe you asked for the console
> > encoding and there is no console on this device or no API to query it),
> > rather than "this is not registered" (which is what other is used for).
>
> Right. "other" is used when the implementation knows the encoding but not
> its MIB (because it isn't registered).
>
> Note also the value 0 is left unused and could be used to stand for
> "invalid".
>
> > > That implies that an implementation is not required to compare equal
> > > any ID it doesn't know about.
> >
> > Currently, it cannot even be constructed with a mib it doesn't know
> > about.
> > Either it knows about the encoding or the mib will be other. There is no
> > scenario in which you can construct a text_encoding with a mib which it
> > doesn't know about.
> > This is why I do not want to have a constructor taking an arbitrary mib.
> > If that is actually useful, I'd rather have a static method which returns
> > either optional<text_encoding> or a text_encoding with mib unknown in
> > case the mib is not known to the implementation.
>
> I'm not talking about creating from a MIB, but instead about creating from
> a text name.
>
> And I don't see how you can prohibit an arbitrary MIB. If you allow a
> constructor taking mib::csUTF8, then you allow the same constructor taking
> mib(5). The only thing you can do is say that's implementation-defined.
>

I do not allow that, for all the reasons you mentioned.
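
As a rough sketch of the static-method alternative mentioned earlier in the thread: the constructor from a MIB value is private, and the only public entry point validates the value first. The names `mib_id` and `from_mib` and the two-entry list of known values are hypothetical, chosen only for this example.

```cpp
#include <optional>

// Hypothetical sketch; values follow the IANA Character Sets MIB (RFC 3808).
enum class mib_id : int { other = 1, unknown = 2, ASCII = 3, UTF8 = 106 };

class text_encoding_id {
public:
    // The only public way to build from a MIB value. Values this
    // (illustrative) implementation does not know yield an empty optional,
    // so no object ever carries a mib the implementation cannot name.
    static std::optional<text_encoding_id> from_mib(mib_id m) {
        switch (m) {
        case mib_id::ASCII:
        case mib_id::UTF8:
            return text_encoding_id(m);
        default:
            return std::nullopt;
        }
    }

    mib_id mib() const { return mib_; }

private:
    // Private on purpose: there is no public constructor taking a mib.
    explicit text_encoding_id(mib_id m) : mib_(m) {}

    mib_id mib_;
};
```

In this sketch `text_encoding_id::from_mib(mib_id(5))` compiles, but it returns an empty optional because 5 is not in the list of known values.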

> text_encoding_id(mib(5)).mib() could be mib(5) or mib::unknown.
>
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
> Software Architect - Intel System Software Products



SG16 list run by herb.sutter at gmail.com