[wg14/wg21 liaison] Fwd: (SC22WG14.19436) [SG16] Draft WG14 N2653: char8_t: A type for UTF-8 characters and strings (Revision 1)

From: Aaron Peter Bachmann <aaron_ng_at_[hidden]>
Date: Mon, 7 Jun 2021 09:45:38 +0200

Hello!
At least for C, a uint8_t that is not an alias for char, signed char and
unsigned char is very undesirable.
The [u]intN_t types are just typedefs.
Special-casing uint8_t makes C less regular and breaks existing code.
For performance we have restrict.
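
As a minimal sketch of that last point (the function name add_bias is hypothetical and not taken from the paper or this thread): restrict lets the programmer promise non-overlap per pointer, without changing what uint8_t is. This assumes uint8_t is available, which the standard makes optional.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds *bias to each of the n bytes in buf.
 * The restrict qualifiers promise the compiler that buf and bias do not
 * overlap, so it may load *bias once instead of on every iteration,
 * even where uint8_t is a typedef of unsigned char and could otherwise
 * alias any object. Calling this with overlapping arguments is undefined. */
void add_bias(uint8_t *restrict buf, const uint8_t *restrict bias, size_t n)
{
    assert(buf != NULL && bias != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* Arithmetic is done in int; the cast wraps the result modulo 256. */
        buf[i] = (uint8_t)(buf[i] + *bias);
    }
}
```

Whether a given compiler actually hoists the load is an implementation matter; the qualifier only grants permission.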

Regards, Aaron Peter Bachmann

On 6/6/21 5:49 PM, Tom Honermann wrote:
> On 6/5/21 4:27 PM, Victor Yodaiken wrote:
>>
>>
>> On Sat, Jun 5, 2021 at 9:51 AM Tom Honermann <tom_at_[hidden]> wrote:
>>
>>>
>>> Why is it not desirable?
>>
>> I mentioned that there is a tradeoff between code efficiency and
>> safety in the updates made to the paper.
>>
>> Maybe I'm looking at the wrong version, but I don't see any discussion
>> of that.
>
> From here
> <https://rawgit.com/sg16-unicode/sg16/master/papers/n2653.html#do_char8_t_type>.
> Start reading at the paragraph that begins with "Additional motivation
> for distinct integer types is the ability to specify them as
> non-aliasing types".
>
> > The following example code would be well-formed in C regardless of
> whether char8_t is specified as a new integer type or as a typedef
> name of an existing character type. If char8_t is specified as a
> typedef name of an existing character type, then the example also
> works as expected because it does not violate aliasing rules. However,
> if char8_t is specified as a new integer type, then the example would
> exhibit undefined behavior because an object of type char is accessed
> using the char8_t type (assuming no new special provisions added to
> C17 6.5, Expressions, paragraph 7). *Thus, there is a trade-off
> between code efficiency and safety inherent in how char8_t is defined.*
>
>>>
>>> Thank you, good suggestion.
>>>
>>> I updated the "typedef name vs a new integer type"
>>> <https://rawgit.com/sg16-unicode/sg16/master/papers/n2653.html#do_char8_t_type>
>>> section and have now submitted the paper to
>>>
>>>
>>> That seems more useful. What is the purpose of creating more
>>> type restrictions?
>>
>> The usual reasons; a coherent object model enables type based
>> analysis for improved code generation and other forms of static
>> analysis.
>>
>>
>> What makes it coherent to have unnecessary type UB?
>> What evidence shows TBAA improves performance on real programs?
>> My concern is that there is too much UB in the standard already.
>
> I'm not going to debate this here as other forums are better suited
> for discussing the pros and cons of the C object model. I'm confident
> WG14 is well positioned to make a decision with regard to this paper.
>
> Tom.
>
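
The aliasing trade-off quoted above from the paper can be illustrated with a short sketch. This is not the paper's own example. The names my_char8_t and count_code_points are hypothetical, and the typedef models only the "typedef name" option.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: models char8_t as a typedef name of an existing character
 * type. The name my_char8_t is hypothetical, chosen so it cannot clash
 * with a char8_t provided by the implementation. */
typedef unsigned char my_char8_t;

/* Counts the UTF-8 code points in the first n bytes of s by reading the
 * char objects through my_char8_t. Because my_char8_t is unsigned char,
 * a character type, this access is permitted by C17 6.5 paragraph 7.
 * If my_char8_t were instead a new, distinct integer type with no added
 * aliasing provision, the same reads would be undefined behavior. */
size_t count_code_points(const char *s, size_t n)
{
    assert(s != NULL);
    const my_char8_t *p = (const my_char8_t *)s;
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Continuation bytes have the bit pattern 10xxxxxx; skip them. */
        if ((p[i] & 0xC0u) != 0x80u) {
            ++count;
        }
    }
    return count;
}
```

For example, the three bytes "a\xC3\xA9" (the letter a followed by a two-byte sequence) count as two code points. The function assumes well-formed UTF-8 and does no validation.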

Received on 2021-06-07 02:45:50