Re: [SG16] Bike shedding for Christmas: P1885 Naming Text Encodings

From: Thiago Macieira <thiago_at_[hidden]>
Date: Sat, 04 Jan 2020 11:12:12 -0300
On Tuesday, 31 December 2019 20:49:00 -03 Tom Honermann via SG16 wrote:
> The benefit is that including all of them avoids the problem of
> implementors offering extensions with inconsistent or conflicting
> names. It also doesn't put us in the position of deciding which
> encodings are "important". IANA provides a good specification to
> follow. I don't think we should be subsetting, at least not without
> some clear criteria for determining which encodings make the cut. For
> example, I suspect Shift-JIS gets more use than UTF-32, but the former
> is not included in the proposal and the latter is.

I say we should bite that bullet and list what we think should be available
as compile-time (constexpr) constants: 7-bit US-ASCII, Latin1, UTF-8, UTF-16,
UTF-32, plus something to indicate "System" (the locale encoding). No more.

If someone needs a full listing, they can take the XML file from IANA and
generate their own list, mapping each MIB to all of its names and aliases. The
C++ standard doesn't need to do that for them.

Besides, what are the IDs to be used for? To access an encoder and a decoder?
Aside from those 6 above, everything else, including unregistered names,
aliases and extensions, is optional to all implementations and must be checked
for at runtime. Relying on them existing is not portable.
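
To make the split concrete, here is a rough sketch of what I have in mind. All
the names in it (text_encoding_id, find_optional_encoding, the table) are made
up for illustration and are not from P1885. The numeric values are the IANA
MIBenum numbers, except for "system", which has no IANA registration, so 0 is
just a placeholder. The lookup does an exact string match for brevity; a real
one would also need to handle aliases and case-insensitive comparison.

```cpp
#include <optional>
#include <string_view>

// The six identifiers that would be guaranteed as compile-time constants.
// Values are IANA MIBenum numbers, except `system` (placeholder, assumption).
enum class text_encoding_id : int {
    system = 0,     // hypothetical value: the locale encoding
    ascii  = 3,     // US-ASCII
    latin1 = 4,     // ISO-8859-1
    utf8   = 106,
    utf16  = 1015,
    utf32  = 1017,
};

// Everything else is optional: an implementation may or may not have it.
// This table stands in for whatever an implementation actually ships,
// possibly generated from the IANA XML file.
struct named_encoding {
    std::string_view name;
    int mib;
};

constexpr named_encoding optional_encodings[] = {
    {"Shift_JIS", 17},
    {"windows-1252", 2252},
};

// Hypothetical lookup by name. Returns the MIB if this implementation
// knows the encoding, or an empty optional if it does not.
constexpr std::optional<int> find_optional_encoding(std::string_view name) {
    for (const named_encoding& e : optional_encodings) {
        if (e.name == name) {
            return e.mib;
        }
    }
    return std::nullopt;
}
```

Code that wants UTF-8 can name text_encoding_id::utf8 directly and rely on it
everywhere. Code that wants Shift-JIS has to call something like
find_optional_encoding("Shift_JIS") and handle the empty result, because
nothing guarantees the implementation has it.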

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel System Software Products

Received on 2020-01-04 08:14:45