SG16

Subject: Re: Bike shedding for Christmas: P1885 Naming Text Encodings
From: Thiago Macieira (thiago_at_[hidden])
Date: 2020-01-04 08:12:12


On Tuesday, 31 December 2019 20:49:00 -03 Tom Honermann via SG16 wrote:
> The benefit is that including all of them avoids the problem of
> implementors offering extensions with inconsistent or conflicting
> names. It also doesn't put us in the position of deciding which
> encodings are "important". IANA provides a good specification to
> follow. I don't think we should be subsetting, at least not without
> some clear criteria for determining which encodings make the cut. For
> example, I suspect Shift-JIS gets more use than UTF-32, but the former
> is not included in the proposal and the latter is.

I say we should bite that bullet and settle the list of what we think should
be available as compile-time (constexpr) constants: 7-bit US-ASCII, Latin-1,
UTF-8, UTF-16, UTF-32, plus something to indicate "System" (the locale
encoding). No more.

If someone needs a full listing, they can take the XML file from IANA and
generate their own list, mapping each MIB to all of its names and aliases. The
C++ standard doesn't need to do that for them.

Besides, what are the IDs to be used for? To access an encoder and a decoder?
Aside from the six above, everything else, including unregistered names,
aliases and extensions, is optional for every implementation and must be
checked for at runtime. Relying on any of them existing is not portable.
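
To make the split concrete, here is a rough sketch of what I have in mind. All
the names (text_encoding_id, find_optional_encoding, installed_encodings) are
made up for illustration and are not taken from P1885. The numeric values are
the IANA MIBenum numbers, except "system", which has no IANA value and uses an
arbitrary sentinel here. The table of installed encodings is a stand-in for
whatever a given implementation happens to ship. Assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical enum: the six identifiers that would always be available
// as compile-time constants. Values are IANA MIBenum numbers, except
// `system`, which is an assumed sentinel (IANA assigns no such value).
enum class text_encoding_id : int {
    system = 0,     // assumption: placeholder for the locale encoding
    ascii  = 3,     // US-ASCII
    latin1 = 4,     // ISO-8859-1
    utf8   = 106,
    utf16  = 1015,
    utf32  = 1017,
};

// The fixed set can be used freely in constant expressions.
constexpr bool is_unicode(text_encoding_id id) {
    return id == text_encoding_id::utf8 ||
           id == text_encoding_id::utf16 ||
           id == text_encoding_id::utf32;
}

static_assert(is_unicode(text_encoding_id::utf8));
static_assert(!is_unicode(text_encoding_id::latin1));

// Everything else is optional and has to be looked up at runtime.
struct optional_encoding {
    std::string_view name;
    int mib;
};

// Hypothetical stand-in for the encodings one implementation provides.
constexpr std::array<optional_encoding, 2> installed_encodings{{
    {"Shift_JIS", 17},
    {"windows-1252", 2252},
}};

// Returns the MIBenum if the encoding is available here, nullopt otherwise.
// Exact, case-sensitive match only; real IANA name matching is looser.
std::optional<int> find_optional_encoding(std::string_view name) {
    for (const auto& e : installed_encodings) {
        if (e.name == name) {
            return e.mib;
        }
    }
    return std::nullopt;
}
```

Code that wants Shift-JIS calls find_optional_encoding("Shift_JIS") and has to
handle the empty result; code that wants UTF-8 just names the constant.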

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel System Software Products

SG16 list run by herb.sutter at gmail.com