On Thu, 2 Jul 2020 at 09:04, Jens Maurer via Core <core@lists.isocpp.org> wrote:
On 02/07/2020 07.39, Tom Honermann wrote:
> On 7/1/20 3:28 AM, Jens Maurer wrote:
>> Maybe replace "associated character encoding" -> "associated literal encoding"
>> globally to avoid the mention of "character" here.
> Despite the use of the "C" word, "character encoding" is more consistent
> with Unicode terminology. Though if we really want to be consistent, we
> should use "character encoding form" (which ISO/IEC 10646 then calls
> simply "encoding form"). This is something we could discuss at the SG16
> meeting next week.
The paper is in CWG's court; involving SG16 is not helpful at this stage
absent more severe concerns that would involve sending back the paper
as a CWG action. That said, everybody (including members of SG16) is
invited to CWG telecons to offer their opinion.
Off-topic remarks: Since we'd be using a new term such as
"literal encoding" here, I don't think Unicode will get in our
way. I'd like to point out that "character encoding" (also in the
Unicode meaning) sounds like a character-at-a-time encoding, which
we expressly don't want to require. So, choosing a different term
than one that has Unicode semantic connotations seems wise.
"Literal encoding" is a less ambiguous term either way. We need
terminology that lets us distinguish the encoding of literals from that
of runtime strings; "literal (associated) encoding" achieves that.
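
To make the literal/runtime distinction concrete, here is a minimal sketch. The helper names (unit_count, unit_at, runtime_unit_count) are hypothetical and not taken from the paper or this thread, and the values quoted for the ordinary literal assume a compiler that can represent U+00E9 in its literal encoding; treat it as an illustration, not proposed wording.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, for illustration only.

// Number of code units in a string literal, excluding the terminating NUL.
template <typename CharT, std::size_t N>
constexpr std::size_t unit_count(const CharT (&)[N]) { return N - 1; }

// Value of code unit i, widened to unsigned (0..255 for 8-bit code units).
template <typename CharT, std::size_t N>
constexpr unsigned unit_at(const CharT (&lit)[N], std::size_t i) {
    return static_cast<unsigned char>(lit[i]);
}

// UTF-8 literal: the encoding is fixed by the language, so U+00E9 is
// always the two code units 0xC3 0xA9.
static_assert(unit_count(u8"\u00e9") == 2, "UTF-8 literal has two code units");
static_assert(unit_at(u8"\u00e9", 0) == 0xC3, "first UTF-8 code unit");
static_assert(unit_at(u8"\u00e9", 1) == 0xA9, "second UTF-8 code unit");

// Ordinary literal: the encoding is chosen by the implementation at
// translation time (what this thread calls the "literal encoding").
// Assumption: this is 2 when that encoding is UTF-8 and 1 when it is
// Latin-1 or Windows-1252; other encodings may differ or reject U+00E9.
constexpr std::size_t ordinary_units = unit_count("\u00e9");

// Runtime string: the bytes arrive at run time (file, network, user input),
// so their encoding depends on the data source, not on the compiler's
// literal encoding. Only the byte count can be known here.
inline std::size_t runtime_unit_count(const std::string& s) { return s.size(); }
```

The first two cases are settled when the program is translated; the third is not, which is why a separate term for the encoding of literals is useful.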
Ah, I think we may be crossing hairs here. I agree that we
should have an abstract name that indicates the encoding used for
literals. We lack a term for that today (which is why the paper
uses the phrase "encoding of the execution character set"). But
that is different from what is intended by "associated character
encoding"; this is intended to name an encoding (possibly
indirectly, hence "encoding of the ...") that might be registered
with IANA (where the term "character set" is used to mean "character
encoding", but the former is used for legacy reasons).
Tom.