Date: Sat, 17 Jan 2026 08:21:07 -0800
On Saturday, 17 January 2026 08:07:44 Pacific Standard Time Jan Schultke wrote:
> > But what's the point of leaving those bits indeterminate, even in a byte
> > array? What can one do to them?
>
> One can and should ignore them basically. There are existing ways to type
> pun long double to __int128. For example, bit-cast to a byte array, clear
> the upper 6 bytes by hand, and then bit-cast a second time to __int128.
> It's ugly, but it works.
I know, that's exactly what I did for qHash. Though note I have to use
memcpy(), not bit_cast yet.
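A minimal sketch of that memcpy()-based workaround, for illustration only. It is not the actual qHash code: the helper name long_double_bits is hypothetical, and the sketch assumes the x86-64 GCC/Clang ABI (little-endian, 80-bit x87 long double stored in 16 bytes, with the last 6 bytes being padding).

```cpp
#include <cstring>
#include <limits>

// Assumption: 80-bit x87 extended format padded to 16 bytes, little-endian.
static_assert(sizeof(long double) == 16 &&
              std::numeric_limits<long double>::digits == 64,
              "sketch assumes 80-bit x87 long double padded to 16 bytes");

// Hypothetical helper: returns the 80 value bits of a long double in the low
// bits of an unsigned __int128, with the 48 padding bits forced to zero.
inline unsigned __int128 long_double_bits(long double value)
{
    unsigned char bytes[sizeof(long double)];
    std::memcpy(bytes, &value, sizeof bytes);       // copy the object representation
    std::memset(bytes + 10, 0, sizeof bytes - 10);  // clear the 6 padding bytes by hand
    unsigned __int128 result;
    std::memcpy(&result, bytes, sizeof result);     // second copy, now fully determinate
    return result;
}
```

Under those assumptions, long_double_bits(1.0L) has 0x3FFF in bits 64..79 and 0x8000000000000000 in the low 64 bits.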
But again my point: if the only objective of keeping the indeterminate bits is
to subsequently clear them, why not let bit_cast clear them in the first place?
> It would be a bug to read those upper 6 bytes, and making them
> indeterminate allows UBSan to catch the bug.
And making bit_cast clear them removes the bug.
> > As in my reply as feedback: the compiler can avoid emitting the
> > bit-clearing code if it can tell you did not observe the cleared bits.
> > For example:
> >
> > uint64_t mantissa_of(long double ldbl)
> > {
> >     return std::bit_cast<unsigned __int128>(ldbl);
> > }
>
> I suspect that this involves more work than you think. First and foremost,
> this prevents UBSan from diagnosing what is essentially a read of
> uninitialized memory. I also don't want to encourage people to write code
> where some of the bits in their values are unspecified. That seems like
> playing with fire. For example, all it takes is one typo in a bit mask you
> use to discard those upper unspecified bits, and you're branching on an
> unspecified value.
Your reply applies only if the cast is not UB but still leaves indeterminate
bits in the resulting __int128. That's not what I am arguing for. I am arguing
that the bits should be cleared by the bit_cast, which in turn means there is
no bug for UBSan to catch in the first place.
The point I was trying to make with the example is that the actual clearing
does not need to happen in machine code, because the upper half was discarded.
There would be no performance penalty to the above code semantically clearing
the bits.
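To make that concrete with today's tools, here is a rough sketch using memcpy() in place of the proposed bit_cast semantics. Both function names are hypothetical, and the same ABI assumption applies (x86-64 GCC/Clang, little-endian, 80-bit x87 long double padded to 16 bytes). The first function clears the padding and then truncates; the second never touches the padding at all. They return the same value for every input, so under the as-if rule a compiler is free to generate the second from the first.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: 80-bit x87 extended format padded to 16 bytes, little-endian.
static_assert(sizeof(long double) == 16 &&
              std::numeric_limits<long double>::digits == 64,
              "sketch assumes 80-bit x87 long double padded to 16 bytes");

// Hypothetical: widen to 128 bits with the padding cleared, then truncate.
inline std::uint64_t mantissa_with_clear(long double value)
{
    unsigned char bytes[sizeof(long double)];
    std::memcpy(bytes, &value, sizeof bytes);
    std::memset(bytes + 10, 0, sizeof bytes - 10);  // the "semantic" clearing
    unsigned __int128 wide;
    std::memcpy(&wide, bytes, sizeof wide);
    return static_cast<std::uint64_t>(wide);        // upper 64 bits discarded here
}

// Hypothetical: read only the low 8 bytes; the padding is never observed.
inline std::uint64_t mantissa_direct(long double value)
{
    std::uint64_t low;
    std::memcpy(&low, &value, sizeof low);
    return low;
}
```

Since the cleared bytes all live in the discarded upper half, the clearing in mantissa_with_clear is a dead store that an optimizer may remove.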
-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Principal Engineer - Intel Data Center - Platform & Sys. Eng.
Received on 2026-01-17 16:21:11
