NB comment review: FR 5.3 [lex.charset] Replace "translation character set" by "Unicode"

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 26 Oct 2022 11:56:29 -0400
Please review the following. If you agree with the proposed change and
have no further information to add, then there is no need to respond. If
you disagree with the proposed change, have corrections or new
information to offer, or have comments on the candidate polls, then
*please reply by Monday, October 31st*.


  FR 5.3 [lex.charset] <http://eel.is/c++draft/lex.charset> Replace
  "translation character set" by "Unicode"

GitHub nbballot issue #422
<https://github.com/cplusplus/nbballot/issues/422>.


    Comment:

C++23 introduces the term "translation character set" to designate
Unicode scalar values. This new term is C++ specific and has no benefit
over the terms scalar value or codepoints (both can be used
interchangeably as surrogates are not permitted after phase 1 of
translation). Because other terms exist, and because making characters
up for non-assigned codepoints doesn't match any possible definition of
the term "character", we would like the term "translation character
set" replaced by "Unicode" and "elements of the translation character
set" replaced by codepoint or scalar value. In places in [lex] where the
term character is used to mean "codepoint", it should be replaced by
"codepoint".
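
For readers less familiar with the terms, here is a minimal sketch of the
distinction the comment relies on: a code point is any value in
U+0000..U+10FFFF, and a scalar value is a code point that is not a
surrogate. The helper names are hypothetical and chosen for illustration
only; they appear neither in the C++ draft nor in the comment.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of any standard library.

// A Unicode code point is any integer in the range U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t c) {
    return c <= 0x10FFFF;
}

// Surrogate code points occupy the range U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t c) {
    return c >= 0xD800 && c <= 0xDFFF;
}

// A Unicode scalar value is any code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t c) {
    return is_code_point(c) && !is_surrogate(c);
}

static_assert(is_scalar_value(0x41), "U+0041 is a scalar value");
static_assert(is_code_point(0xD800) && !is_scalar_value(0xD800),
              "a surrogate is a code point but not a scalar value");
static_assert(!is_code_point(0x110000), "beyond the Unicode code space");
```

Once surrogates are excluded, as the comment says is the case after phase 1
of translation, the two predicates accept the same set of values. Neither
predicate says anything about whether a value is assigned to a character.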


    Proposed change:

<blank>


    SG16 chair notes:

These concerns were discussed as part of the reviews of P2314
<https://wg21.link/p2314> and P2297 <https://wg21.link/p2297> during the
2021-03-24 SG16 telecon
<https://github.com/sg16-unicode/sg16-meetings/blob/master/README-2021.md#march-24th-2021>.
The following poll was taken at that time. The comment does not appear
to present new information.

  * Poll: Introduce the concept of a 'translation character set' which
    synthesizes characters for unassigned UCS scalar values.
      o Attendance: 9
      o   SF    F    N    A   SA
           2    4    1    0    1

      o Consensus is in favor.
      o SA: The "translation character set" abstraction is unnecessary
        and the definition uses terminology incorrectly.


    Candidate polls:

  * [FR-XX]: SG16 recommends accepting the comment in the direction
    suggested in the comment text.
  * [FR-XX]: SG16 recommends rejecting the comment as not a defect but
    encourages exploration of changes that would enable elimination of
    the "translation character set" terminology in a future standard.
  * [FR-XX]: SG16 recommends rejecting the comment as not a defect.

Tom.

Received on 2022-10-26 15:56:32