Subject: Re: Bike shedding for Christmas: P1885 Naming Text Encodings
From: Steve Downey (sdowney_at_[hidden])
Date: 2019-12-27 17:27:53

WHATWG favors Encoding rather than charset, and an Encoding has an
associated Encoder and Decoder.

They focus on conversion to and from Unicode scalar values, but since
essentially all interesting encodings are defined in terms of ISO/IEC 10646,
that may not be a particular issue.
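
As a rough illustration of that split, here is a minimal C++17 sketch in which a name-only type sits beside separate decoder and encoder types that go through Unicode scalar values. All names here (encoding_name, latin1_decoder, utf8_encoder) are hypothetical and chosen for this example; none of them is taken from P1885 or from the WHATWG Encoding Standard.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical: identifies an encoding by name only and performs no conversion.
struct encoding_name {
    std::string_view name;
};

// Hypothetical decoder: ISO-8859-1 bytes -> Unicode scalar values.
// Each ISO-8859-1 byte maps to the code point with the same numeric value.
struct latin1_decoder {
    static constexpr encoding_name encoding{"ISO-8859-1"};

    std::vector<char32_t> decode(std::string_view bytes) const {
        std::vector<char32_t> out;
        out.reserve(bytes.size());
        for (unsigned char b : bytes) {
            out.push_back(static_cast<char32_t>(b));
        }
        return out;
    }
};

// Hypothetical encoder: Unicode scalar values -> UTF-8 bytes.
// Assumes the input holds valid scalar values (no surrogates, <= U+10FFFF).
struct utf8_encoder {
    static constexpr encoding_name encoding{"UTF-8"};

    std::string encode(const std::vector<char32_t>& scalars) const {
        std::string out;
        for (char32_t c : scalars) {
            assert(c <= 0x10FFFF && (c < 0xD800 || c > 0xDFFF));
            if (c < 0x80) {                       // 1 byte
                out += static_cast<char>(c);
            } else if (c < 0x800) {               // 2 bytes
                out += static_cast<char>(0xC0 | (c >> 6));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else if (c < 0x10000) {             // 3 bytes
                out += static_cast<char>(0xE0 | (c >> 12));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else {                              // 4 bytes
                out += static_cast<char>(0xF0 | (c >> 18));
                out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        return out;
    }
};
```

Transcoding is then a decode followed by an encode, for example utf8_encoder{}.encode(latin1_decoder{}.decode(input)), while the name type stays usable on its own for identifying or comparing encodings.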

On Fri, Dec 27, 2019, 06:28 Corentin Jabot via SG16 <sg16_at_[hidden]> wrote:

> Hello
> In P1885, I introduce the name "text_encoding" for the class representing
> the name of a text encoding.
> I wonder whether that might conflict or interfere with actual
> encoding/decoder classes and would like your opinion.
> Here are a few possible names:
> * Charset (IANA nomenclature, posix)
> * text_codec (Qt)
> * text_encoding
> * text_encoding_name (encoding is used by posix / python /
> Unicode nomenclature would favor encoding (Unicode is a charset of which
> UTF-8 and UTF-16 are both encodings)
> if text_encoding remains the name of that class, encoder/decoder can be
> used for the class doing the actual conversions.
> I will further rename "system" to "environment" to be more generic and
> aligned with POSIX.
> (user, environment and system are, for our purposes, synonyms and intended to
> mean "the encoding assumed and expected by whatever launched our program").
> Environment has the added benefit that it implies neither user nor system,
> which makes it more friendly to embedded platforms.
> Thanks for your input,
> Corentin
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16

SG16 list run by sg16-owner@lists.isocpp.org