Date: Sat, 17 Jan 2026 17:07:44 +0100
>
> But what's the point of leaving those bits indeterminate, even in a byte
> array? What can one do to them?
>
Basically, one can and should ignore them. There are existing ways to type
pun long double to __int128. For example, bit-cast to a byte array, clear
the upper 6 bytes by hand, and then bit-cast a second time to __int128.
It's ugly, but it works.

It would be a bug to read those upper 6 bytes, and making them
indeterminate allows UBSan to catch the bug.

> As in my reply as feedback: the compiler can avoid emitting the
> bit-clearing code if it can tell you did not observe the cleared bits.
> For example:
>
> uint64_t mantissa_of(long double ldbl)
> {
>     return std::bit_cast<unsigned __int128>(ldbl);
> }
>
> This returns the 64 bits of the mantissa of an 80-bit extended precision
> long double, which never contain any padding, and therefore the compiler
> does not need to clear the unobserved padding bits. And it can be written
> now, relying on the compilers emitting suitable code even in spite of it
> being UB (as per your intro, but see my feedback email).
>
> But it only works if we fix std::bit_cast<unsigned __int128>.
>
I suspect that this involves more work than you think. First and foremost,
this prevents UBSan from diagnosing what is essentially a read of
uninitialized memory. I also don't want to encourage people to write code
where some of the bits in their values are unspecified. That seems like
playing with fire. For example, all it takes is one typo in a bit mask you
use to discard those upper unspecified bits, and you're branching on an
unspecified value.

I also don't see how this would work during constant evaluation. We don't
usually want unspecified behavior at compile time, and defining when a
bitwise AND "un-unspecifies" a value (and other such operations) seems like
a lot of core language work which is not warranted here at all.
Received on 2026-01-17 16:08:00
