sg12: [ub] Aliasing char16_t with int_least16

From: Jeffrey Yasskin <jyasskin_at_[hidden]>
Date: Wed, 30 Oct 2013 10:14:04 -0700

I was sent a code review today that wanted to pass an array of wchar_t
(sizeof(wchar_t)==2 on Windows) to a function taking const uint16_t*
(https://code.google.com/p/chromium/codesearch/#chromium/src/third_party/harfbuzz-ng/src/hb-buffer.cc&l=982).
The proposed code did this with "reinterpret_cast<const
uint16_t*>(the_wchar_t_pointer)", but I had to point out that this
violates [basic.lval]p10. The workarounds seem to involve either
copying the array or adding overloads to the function that pass
through to a template.
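
For concreteness, here is a minimal sketch of the second workaround
(thin overloads that forward to one template). The function name
count_code_points and the detail namespace are hypothetical and chosen
only for illustration; this is not the harfbuzz API.

```cpp
#include <cstddef>
#include <cstdint>

namespace detail {
// Shared implementation, instantiated once per code-unit type.
// Counts code points by skipping low (trailing) surrogates.
template <typename CodeUnit>
std::size_t count_code_points(const CodeUnit* text, std::size_t length) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < length; ++i) {
    const std::uint32_t unit = static_cast<std::uint32_t>(text[i]);
    // A low surrogate continues the previous code point, so skip it.
    if (unit < 0xDC00u || unit > 0xDFFFu) ++count;
  }
  return count;
}
}  // namespace detail

// Thin overloads: no reinterpret_cast and no copy, at the cost of one
// template instantiation per accepted code-unit type.
std::size_t count_code_points(const std::uint16_t* text, std::size_t length) {
  return detail::count_code_points(text, length);
}

std::size_t count_code_points(const char16_t* text, std::size_t length) {
  return detail::count_code_points(text, length);
}

// Assumes the caller only passes UTF-16 code units here, as on Windows
// where sizeof(wchar_t) == 2.
std::size_t count_code_points(const wchar_t* text, std::size_t length) {
  return detail::count_code_points(text, length);
}
```

Each caller keeps its own pointer type, so nothing is accessed through
an lvalue of the wrong type; the "code bloat" is the extra
instantiations.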

Can we make this sort of aliasing defined instead? With two or three
ways to represent a UTF-16 array (uint16_t, char16_t, and wchar_t where
it is 16 bits), we're likely to see more undefined casting as users try
to avoid extra copies or perceived code bloat.

I think the change would be to add some bullets in [basic.lval]p10:
* [a type that is] the (possibly cv-qualified) underlying type of the
dynamic type of the object,
* [a type that is] the (possibly cv-qualified) signed or unsigned type
corresponding to the underlying type of the dynamic type of the
object,
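
To illustrate the gap these bullets would close, here is a small sketch
using char16_t, whose underlying type is uint_least16_t. The helper
name copy_code_units is hypothetical. The static_asserts show that the
two types are distinct yet share size and alignment; the function shows
the copy that is the well-defined alternative to the cast today.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// char16_t is a distinct type, not a typedef for its underlying type...
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is a distinct type");
// ...but it has the same size and alignment as that underlying type.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
              "same size as the underlying type");
static_assert(alignof(char16_t) == alignof(std::uint_least16_t),
              "same alignment as the underlying type");

// Hypothetical helper: well-defined today, but pays for a copy.
// The alternative under discussion is reading the original array through
//   reinterpret_cast<const std::uint_least16_t*>(text)
// which is the access the proposed bullets would make defined.
std::vector<std::uint_least16_t> copy_code_units(const char16_t* text,
                                                 std::size_t length) {
  return std::vector<std::uint_least16_t>(text, text + length);
}
```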

Would we want to go the other way too? That is, do we want to force
everyone writing a flexible UTF-16 function to take uint16_t, or could
they accept char16_t too? If we want to let them take char16_t, we'd
need to add:
* a (possibly cv-qualified) type whose underlying type is the dynamic
type of the object
* a (possibly cv-qualified) type whose underlying type is the signed or
unsigned type corresponding to the dynamic type of the object

Jeffrey

Received on 2013-10-30 18:14:26