Re: CWG 2779: Restrictions on the ordinary literal encoding

From: Steve Downey <sdowney_at_[hidden]>
Date: Mon, 31 Jul 2023 10:44:56 -0400
Possibly distribute the 'distinct values of type' across this disjunction?
That is, rule out the possibility of mixing unsigned and signed
representations:

    distinct values of type signed char or distinct values of type unsigned
    char.

I'm not sure which we consider "more fundamental" so to speak, char or the
signed/unsigned versions.

On Mon, Jul 31, 2023 at 9:36 AM Tom Honermann via SG16 <
sg16_at_[hidden]> wrote:

> The suggested resolution for CWG 2779 (Restrictions on the ordinary
> literal encoding) <https://cplusplus.github.io/CWG/issues/2779.html>
> modifies [lex.charset]p8 <https://eel.is/c++draft/lex.charset#8> as
> follows:
>
> ... The *ordinary literal encoding* is the implementation-defined
> encoding applied to an ordinary character or string literal; its code
> unit values are required to be representable as distinct values of type signed
> char or unsigned char. The *wide literal encoding* is the
> implementation-defined encoding applied to a wide character or string
> literal; its code unit values are required to be representable as
> distinct values of type wchar_t.
>
> I am uncertain about the proposed "signed char or unsigned char" wording.
> I think the intent is to state that each code unit value must have a
> distinct representation in char regardless of its underlying type.
> However, the wording could be read as permitting (distinct) values in the
> range [SCHAR_MIN, UCHAR_MAX]. I think it would be simpler to restrict
> representation to char with an implicit reliance on [basic.fundamental]p6
> <http://eel.is/c++draft/basic.fundamental#6> for mapping code unit values
> from the signed char and unsigned char ranges to representable values in
> char.
>
> ... The *ordinary literal encoding* is the implementation-defined
> encoding applied to an ordinary character or string literal; its code
> unit values are required to be representable as distinct values of type
> char. The *wide literal encoding* is the
> implementation-defined encoding applied to a wide character or string
> literal; its code unit values are required to be representable as
> distinct values of type wchar_t.
>
> If desired, we could add a note to clarify the relationship to the signed
> char and unsigned char value ranges.
>
> Tom.
> --
> SG16 mailing list
> SG16_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg16
>

Received on 2023-07-31 14:45:10