On Fri, Jan 29, 2021, 08:20 Tom Honermann <tom@honermann.net> wrote:
On 1/28/21 1:57 PM, Jens Maurer via SG16 wrote:
On 28/01/2021 19.37, Corentin via SG16 wrote:
On Thu, Jan 28, 2021 at 7:22 PM Peter Brett <pbrett@cadence.com> wrote:

    I think the big problem here is trying to make it a template.

    Make it named.  It’s literally not possible to use this correctly in generic code.

Question then is do we want to solve the issue for wchar_t?
Because having the name of the encoding in the function kinda precludes that, the sizeof(wchar_t) being platform dependent.
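
To make the platform dependence concrete, here is a minimal sketch of picking the same-sized UTF code unit type at compile time. The alias name wchar_code_unit_t is hypothetical (it is not from any proposal), and the sketch assumes wchar_t is either 16 or 32 bits wide.

```cpp
#include <type_traits>

// Hypothetical alias: the charN_t type with the same size as wchar_t.
// Assumption: wchar_t is 16 bits (e.g. Windows) or 32 bits (e.g. Linux).
using wchar_code_unit_t =
    std::conditional_t<sizeof(wchar_t) == sizeof(char16_t), char16_t, char32_t>;

// Fails to compile on a platform where the assumption above does not hold.
static_assert(sizeof(wchar_code_unit_t) == sizeof(wchar_t),
              "wchar_t is neither 16 nor 32 bits wide on this platform");
```

A function whose name spells out the encoding cannot express this choice, which is why the question about wchar_t arises.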
You only get away with char* -> char8_t* because "char" has special
aliasing exceptions.

You'll get the full set of aliasing concerns for
  wchar_t* -> char16_t* or char32_t*
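
For comparison, a sketch of the baseline that sidesteps the aliasing concerns entirely by copying the code units instead of converting the pointer. The helper name copy_code_units is hypothetical. This is not the zero-copy conversion being asked for in this thread; it only shows what is well-defined today.

```cpp
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: copy wchar_t code units into storage of a same-sized
// character type To (char16_t or char32_t, depending on the platform).
// No wchar_t object is ever accessed through a To lvalue, so the aliasing
// rules are not involved; the cost is an allocation and a copy.
template <typename To>
std::basic_string<To> copy_code_units(std::wstring_view in) {
    static_assert(sizeof(To) == sizeof(wchar_t),
                  "To must have the same size as wchar_t");
    std::basic_string<To> out(in.size(), To{});
    if (!in.empty()) {
        std::memcpy(out.data(), in.data(), in.size() * sizeof(To));
    }
    return out;
}
```

On a platform with a 16-bit wchar_t this would be called as copy_code_units<char16_t>(ws); with a 32-bit wchar_t, as copy_code_units<char32_t>(ws).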

I think what we're looking for is a portable solution for this ICU hack (generalized to make it work for [unsigned] char* conversion to char8_t*); the goal being to enable some form of explicit restricted pointer interconvertibility between same sized/aligned types.

I don't understand the ICU hack sufficiently well to relate it to a memory or object model.  I'm also not sure that it actually works (though it may suffice for the scenarios that are encountered in practice).

Perhaps something like this would suffice.

#include <type_traits>

template<typename To, typename From>
requires requires {
    requires std::is_trivial_v<To>;
    requires std::is_trivial_v<From>;
    requires sizeof(To) == sizeof(From);
    requires alignof(To) == alignof(From);
}
To* alias_barrier_cast(From *p) {
    // Empty asm with a "memory" clobber: a compiler-level barrier (GCC/Clang syntax).
    asm volatile("" : : "rm"(p) : "memory");
    return reinterpret_cast<To*>(p);
}

I don't think we want to generalize to all trivial types which are not character types.
What I would like to find out is whether the problem we know exists with char/char8_t also exists for wchar_t/char16_t (or char32_t, although wchar_t is of limited use on platforms where it's 32 bits).