sg12

Re: [ub] Aliasing char16_t with int_least16_t, etc.

From: Lawrence Crowl <Lawrence_at_[hidden]>
Date: Wed, 30 Oct 2013 15:06:35 -0700
On 10/30/13, Jeffrey Yasskin <jyasskin_at_[hidden]> wrote:
> I was sent a code review today that wanted to pass an array of wchar_t
> (sizeof(wchar_t)==2 on Windows) to a function taking const uint16_t*
> (https://code.google.com/p/chromium/codesearch/#chromium/src/third_party/harfbuzz-ng/src/hb-buffer.cc&l=982).
> The proposed code did this with "reinterpret_cast<const
> uint16_t*>(the_wchar_t_pointer)", but I had to point out that this
> violates [basic.lval]p10. The workarounds seem to involve either
> copying the array or adding overloads to the function that pass
> through to a template.
>
> Can we make this sort of aliasing defined instead? With 2-3 ways to
> represent a utf-16 array, we're likely to see more undefined casting
> as users try to avoid extra copies or perceived code bloat.
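
For illustration, the "overloads that pass through to a template"
workaround mentioned above might look roughly like the sketch below.
The function name count_surrogates and the detail namespace are
hypothetical and are not taken from HarfBuzz or Chromium. The sketch
assumes 8-bit bytes and that std::uint16_t exists on the platform.

```cpp
#include <cstddef>
#include <cstdint>

namespace detail {
// Shared implementation, instantiated once per code-unit type.
template <typename CodeUnit>
std::size_t count_surrogates_impl(const CodeUnit* text, std::size_t length) {
    static_assert(sizeof(CodeUnit) == 2, "expected a 16-bit code unit type");
    std::size_t count = 0;
    for (std::size_t i = 0; i < length; ++i) {
        // Convert the value, not the pointer, so no aliasing rule is involved.
        const auto unit = static_cast<std::uint16_t>(text[i]);
        if (unit >= 0xD800 && unit <= 0xDFFF) ++count;
    }
    return count;
}
}  // namespace detail

// Thin overloads: each forwards to the template without copying or casting
// the array.
std::size_t count_surrogates(const std::uint16_t* text, std::size_t length) {
    return detail::count_surrogates_impl(text, length);
}

std::size_t count_surrogates(const char16_t* text, std::size_t length) {
    return detail::count_surrogates_impl(text, length);
}
```

A wchar_t overload could be added the same way on platforms where
sizeof(wchar_t) == 2. The cost is one instantiation of the
implementation per code-unit type, which is the code bloat the message
refers to.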

The undefined behavior permits better alias analysis: the compiler
may assume that lvalues of these distinct types never refer to the
same object. I do not know how large the effect is.
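
As a rough illustration of the optimization at stake, consider the
sketch below. The function fill_spaces is hypothetical. Because
char16_t and std::uint16_t are distinct types, a compiler may today
assume that the stores through dst cannot change *count, and so may
load *count once before the loop. If the two types were allowed to
alias, it would have to allow for the possibility that each store
modifies *count. Whether a given compiler actually performs this
hoisting is not claimed here.

```cpp
#include <cstdint>

// Writes *count space characters to dst.
// Under the current rules the compiler may treat *count as loop-invariant,
// since a store to a char16_t object cannot legally modify a std::uint16_t.
void fill_spaces(char16_t* dst, const std::uint16_t* count) {
    for (std::uint16_t i = 0; i < *count; ++i) {
        dst[i] = u' ';
    }
}
```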

>
> I think the change would be to add some bullets in [basic.lval]p10:
> * [a type that is] the (possibly cv-qualified) underlying type of the
> dynamic type of the object,
> * [a type that is] the (possibly cv-qualified) signed or unsigned type
> corresponding to the underlying type of the dynamic type of the
> object,
>
> Would we want to go the other way too? That is, do we want to force
> everyone writing a flexible utf-16 function to take uint16_t, or could
> they accept char16_t too? If we want to let them take char16_t, we'd
> need to add:
> * a (possibly cv-qualified) type whose underlying type is the dynamic
> type of the object
> * a (possibly cv-qualified) type whose underlying type is the signed or
> unsigned type corresponding to the dynamic type of the object

Going even further, we could consider saying that an object may be
accessed through any integral type with the same size and alignment.
But of course, that makes alias analysis even less effective than what
you suggest. I do not know how important that is.

If we do allow such accesses, I suggest either adding another form of
cast or loosening the static_cast requirements, to allow checkable
code like the following.

extern void foo( char16_t* p );
void bar( int_least16_t* q ) { foo( static_cast<char16_t*>(q) ); }

If int_least16_t is not 16 bits, the compilation should fail with
a helpful message.
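
The compile-time check itself can be approximated in today's language
with a small helper, sketched below. The name same_layout_cast is
hypothetical. It only rejects mismatched sizes and alignments at
compile time. Reading or writing through the returned pointer is still
undefined under the current [basic.lval]p10 unless the aliasing rules
are relaxed as discussed above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper: converts between pointers to integral types, but only
// compiles when the two types have the same size and alignment.
template <typename To, typename From>
To* same_layout_cast(From* p) {
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "same_layout_cast requires integral types");
    static_assert(sizeof(To) == sizeof(From) && alignof(To) == alignof(From),
                  "same_layout_cast requires identical size and alignment");
    // This does NOT make access through the result defined today; it only
    // provides the checkable conversion.
    return reinterpret_cast<To*>(p);
}
```

With this helper, bar above would call
foo( same_layout_cast<char16_t>(q) ), and the second static_assert
would fire on a platform where int_least16_t is wider than char16_t.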

-- 
Lawrence Crowl

Received on 2013-10-30 23:06:37