[SG16] Feedback on D2460R5: UTF-16 is standard practice

From: Tom Honermann <tom_at_[hidden]>
Date: Wed, 6 Oct 2021 14:45:48 -0400
A couple of comments/questions on the revision of D2460R5
<https://isocpp.org/files/papers/D2460R0.pdf> (UTF-16 is standard
practice) that we will be reviewing in today's SG16 telecon.

 1. The second wording change is actually against
    [character.seq.general]p1
    <http://eel.is/c++draft/character.seq.general#1>, not [character.seq]p1.
 2. The change to [character.seq.general]p1
    <http://eel.is/c++draft/character.seq.general#1> is missing the "--
    end note ]" that completes the (unchanged) note. The text from the
    first bullet point appears to have been pasted at this location,
    which also broke the bullets; the text beginning with "A letter is
    any of ..." should be the third bullet point, and the next sentence,
    beginning with "The decimal-point character ...", should be the
    fourth.
 3. The change to [character.seq.general]p1
    <http://eel.is/c++draft/character.seq.general#1> retains the ", but
    may change ..." text at the end that was removed by P2314R4
    <https://wiki.edg.com/pub/Wg21virtual2021-10/StrawPolls/p2314r4.html>.
 4. I don't understand the motivation for the addition of "All elements
    of the execution wide-character set are encoded as a single code
    unit representable by a value of type wchar_t." to
    [character.seq.general]p1
    <http://eel.is/c++draft/character.seq.general#1>. That seems to
    effectively reintroduce the same requirement removed from
    [basic.fundamental]p8 <http://eel.is/c++draft/basic.fundamental#8>.
 5. I recommend changing the title to better explain what the paper
    proposes. For example, "Allow wide character encodings to be
    variable length encodings like UTF-16 to match existing practice".
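
To illustrate the concern in item 4, here is a minimal sketch of why a
"single code unit" requirement conflicts with UTF-16. It uses char16_t
rather than wchar_t so that the result does not depend on the platform's
wchar_t width, and the helper name encode_utf16 is hypothetical, not
taken from the paper or the standard. It assumes the input is a valid
Unicode scalar value.

```cpp
#include <string>

// Hypothetical helper (not from the paper): encode one Unicode scalar
// value as UTF-16. Assumes cp is a valid scalar value (not a surrogate,
// not above U+10FFFF).
std::u16string encode_utf16(char32_t cp) {
    if (cp < 0x10000) {
        // Code points in the Basic Multilingual Plane fit in one code unit.
        return std::u16string(1, static_cast<char16_t>(cp));
    }
    // Supplementary code points need a surrogate pair: two code units.
    cp -= 0x10000;
    std::u16string out;
    out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high surrogate
    out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
    return out;
}
```

With this sketch, U+0041 encodes as one code unit, while U+1F600 encodes
as the two code units 0xD83D 0xDE00, so on an implementation with a
16-bit wchar_t and UTF-16 as the wide encoding, not every character can
be a single wchar_t value.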

Tom.


Received on 2021-10-06 13:45:52