Re: Feedback on D2513R1: char8_t Compatibility and Portability Fix

From: JeanHeyd Meneide <phdofthehouse_at_[hidden]>
Date: Fri, 11 Feb 2022 01:34:21 -0500
Dear Jens and Hubert,

On Thu, Feb 10, 2022 at 7:35 PM Hubert Tong <hubert.reinterpretcast_at_[hidden]> wrote:

> On Thu, Feb 10, 2022 at 7:28 PM Jens Maurer <Jens.Maurer_at_[hidden]> wrote:
>> On 11/02/2022 01.13, Hubert Tong via SG16 wrote:
>> > Link to paper: https://thephd.dev/_vendor/future_cxx/papers/d1967.html
>> > …
>> > The wording does not address the case of UTF-8 code units in the value
>> > of the u8 string literal with numerical values that exceed the range of
>> > values representable by signed char.
>>
>> This might be a pre-existing defect, but we don't address that case for a
>> regular string literal (on a platform where char happens to be unsigned),
>> either.
> I think the time-honoured tradition of having people address at least some
> defects while "in the area" is reasonable to apply here.

    I think what matches existing practice is to use the rules from
[conv.integral] (http://eel.is/c++draft/conv.integral#3) and do the normal
modulo-2^N conversion. The wording might read like this (note that I have
very little idea whether I'm doing this correctly):

> Additionally, an array of ordinary character type may be initialized by a
> UTF-8 string literal, or by such a string literal enclosed in braces.
> Successive characters of the value of the string-literal initialize the
> elements of the array <INS-NEW>and perform integral conversion
> [conv.integral] if necessary</INS-NEW>.

Is that alright?
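
To make the intended effect concrete, here is a rough sketch of what the modulo-2^N conversion does to a UTF-8 code unit that does not fit in signed char. It assumes 8-bit bytes and the C++20 rule in [conv.integral], where the result is the unique value congruent to the source modulo 2^N. The helper name to_signed_code_unit is made up for illustration and is not part of the proposed wording.

```cpp
#include <cassert>
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper, for illustration only: convert one UTF-8 code unit
// (held as unsigned char) to signed char via the ordinary integral
// conversion. Under the C++20 rules the result is congruent to the source
// value modulo 2^8; before C++20 an out-of-range result was
// implementation-defined.
constexpr signed char to_signed_code_unit(unsigned char code_unit) {
    return static_cast<signed char>(code_unit);
}

// U+00E9 is encoded in UTF-8 as the two code units 0xC3 0xA9.
// Neither fits in signed char, so each wraps: 195 - 256 and 169 - 256.
static_assert(to_signed_code_unit(0xC3) == -61);
static_assert(to_signed_code_unit(0xA9) == -87);

// Code units in the ASCII range are unchanged.
static_assert(to_signed_code_unit(0x41) == 65);

// Converting back to unsigned char recovers the original code unit,
// so no information is lost by the round trip.
static_assert(static_cast<unsigned char>(to_signed_code_unit(0xC3)) == 0xC3);
```

The point of the sketch is only that the conversion is well defined and reversible, which is why leaning on [conv.integral] seems sufficient instead of inventing a new rule.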


Received on 2022-02-11 06:34:36