Re: P2749 (down with "character") and P2736 (Referencing the Unicode Standard) updates

From: Hubert Tong <hubert.reinterpretcast_at_[hidden]>
Date: Tue, 7 Feb 2023 14:56:36 -0500

Thanks Corentin.

I've reviewed the draft R2 version I just got within the hour.

Minor: Revision history ordering is weird: R0 R2 R1

The "or" in the note from which the control code aliases were removed is in
the wrong place.

There is a green "implementation defined" without a hyphen. A hyphen was
requested by CWG.

-- HT

On Tue, Feb 7, 2023 at 2:50 AM Corentin <corentin.jabot_at_[hidden]> wrote:

> Hey Hubert,
>
> I've applied the changes.
> I don't necessarily agree with your first point (or rather I don't agree
> that the added precision is likely to enlighten readers), but I'm happy to
> go with it knowing "down with character" will resolve these questions.
> The rest makes sense - and Tom reported that second issue too.
> Sorry I didn't see your mail earlier.
>
>
> Corentin
>
> On Wed, Feb 1, 2023 at 5:32 PM Hubert Tong <
> hubert.reinterpretcast_at_[hidden]> wrote:
>
>> Hi Corentin:
>>
>> I have comments on https://isocpp.org/files/papers/P2736R1.pdf (version
>> at the time of this writing).
>>
>> (0)
>> The document number still shows as "D2736R1" :)
>>
>> (1)
>> With respect to the changes to [lex.charset], the translation character
>> set was previously a coded character set (the code space of which was
>> congruent with Unicode), but is now a character set (whose elements are
>> abstract characters).
>>
>> The "decoded to produce a sequence of [ ... ] scalar values that
>> constitutes the sequence of elements of the translation character set"
>> wording no longer works as-is because it is the wording equivalent of
>> reinterpret_cast<const TranslationCharacters *>(ScalarValues).
>>
>> Suggestion:
>> "decoded to produce a sequence of Unicode scalar values. A sequence of
>> translation character set elements is then formed by mapping each Unicode
>> scalar value to the corresponding translation character set element."
>>
>> (2)
>> The separation of "control code aliases" from "character name aliases"
>> has been removed. Remove "control code aliases" from the note where it
>> still appears.
>>
>> (3)
>> In the following, "in any of the" is not correct because the set of
>> Unicode encoding forms is unbounded. Replace with "in any".
>> [ Note: No character lacks representation in any of the Unicode encoding
>> forms. — end note ]
>>
>> I hope this helps.
>>
>> Thanks,
>>
>>
>> Hubert Tong
>>
>> On Thu, Jan 26, 2023 at 4:42 AM Corentin via SG16 <sg16_at_[hidden]>
>> wrote:
>>
>>> Hey folks:
>>>
>>> I published https://isocpp.org/files/papers/P2736R1.pdf - with the
>>> changes requested by SG16 yesterday as part of the forwarding poll.
>>> The interesting changes were:
>>>
>>> * to consistently mention UAX XX of The Unicode Standard
>>> * __STDC_ISO_10646__ : An integer literal of the form yyyymmL (for
>>> example, 199712L). If this symbol is defined, then its value is
>>> implementation-defined
>>> * Specifically mention UTF-8, UTF-16 and UTF-32 instead of Unicode
>>> encoding
>>>
>>> https://isocpp.org/files/papers/D2749R0.pdf ("down with character"):
>>>
>>> - Remove the footnote about old linkers
>>> - Apply the character -> codepoint changes to the annexes and [diff]
>>> sections
>>> - Remove a stale cross-reference in phase 1 of translation
>>> - Fix various typos
>>>
>>>
>>> Thanks,
>>>
>>> Corentin
>>> --
>>> SG16 mailing list
>>> SG16_at_[hidden]
>>> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>>>
>>
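
The `__STDC_ISO_10646__` item in the 26 January message describes a value of the form yyyymmL. As a rough illustration only, the sketch below splits such a value into year and month and reports whether the macro is defined at all. The names `Iso10646Date`, `decode_yyyymm` and `has_stdc_iso_10646` are hypothetical helpers, not part of P2736 or of any library, and because the value is implementation-defined the sketch does not interpret it further.

```cpp
#include <cassert>

// Hypothetical helper type: year and month taken from a yyyymmL value.
struct Iso10646Date {
    long year;
    long month;
};

// Split a value of the form yyyymmL (for example, 199712L) into its parts.
// Assumption: the value is positive and really has the yyyymm shape.
Iso10646Date decode_yyyymm(long value) {
    assert(value > 0);
    return Iso10646Date{value / 100, value % 100};
}

// Report whether this implementation defines the macro at all.
// If it does, the value itself is implementation-defined.
bool has_stdc_iso_10646() {
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}
```

The macro test is done with `#ifdef` rather than by reading the value, since a conforming implementation may leave the symbol undefined.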

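Point (1) of the 1 February message objects to wording that treats a sequence of scalar values as if it already were a sequence of translation character set elements, and suggests an explicit mapping step instead. The following is a minimal sketch of that distinction, assuming a toy model in which each element simply remembers the scalar value it corresponds to. `ScalarValue`, `TranslationCharacter`, `to_element` and `map_to_translation_characters` are hypothetical names chosen for illustration; none of them comes from P2736 or the C++ standard.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Unicode scalar value: just a number.
struct ScalarValue {
    std::uint32_t value;
};

// Scalar values are code points 0..0x10FFFF excluding surrogates.
bool is_unicode_scalar_value(std::uint32_t v) {
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

// Hypothetical stand-in for an element of the translation character set.
// It is a distinct type that cannot be created directly from a number,
// so the only way to obtain one is through the mapping function below.
class TranslationCharacter {
public:
    std::uint32_t corresponding_scalar() const { return scalar_; }

private:
    explicit TranslationCharacter(std::uint32_t s) : scalar_(s) {}
    friend TranslationCharacter to_element(ScalarValue sv);
    std::uint32_t scalar_;
};

// The explicit mapping step: one scalar value to its corresponding element.
TranslationCharacter to_element(ScalarValue sv) {
    assert(is_unicode_scalar_value(sv.value));
    return TranslationCharacter(sv.value);
}

// Form the element sequence by mapping each decoded scalar value in turn,
// rather than reinterpreting the scalar sequence as an element sequence.
std::vector<TranslationCharacter>
map_to_translation_characters(const std::vector<ScalarValue>& decoded) {
    std::vector<TranslationCharacter> out;
    out.reserve(decoded.size());
    for (ScalarValue sv : decoded) {
        out.push_back(to_element(sv));
    }
    return out;
}
```

In this model the `reinterpret_cast` reading would correspond to using the `ScalarValue` array directly as `TranslationCharacter` objects, which the type system here deliberately does not allow.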
Received on 2023-02-07 19:57:05