std-proposals

Re: [std-proposals] Fixing std::bit_cast padding bit issues

From: Andrey Semashev <andrey.semashev_at_[hidden]>
Date: Fri, 23 Jan 2026 14:29:24 +0300
On 23 Jan 2026 08:41, Jan Schultke via Std-Proposals wrote:
> > I think that's the wrong question to ask. Let's take the hypothetical
> > example of a popcount over a long double:
> >
> > int popcount(long double f)
> > {
> >     std::clear_padding(f);
> >     return std::popcount(reinterpret_cast<uint128_t &>(f));
> > }
> >
> > It doesn't *need* to emit code. In fact, I'd expect this to *suppress*
> > code by knowing that 6 of the 16 bytes are zero and thus don't
> > contribute to the count of bits set. The compiler should realise it
> > only needs to popcount 8+2 bytes.
> >
> > The important question is that of behaviour. So long as it behaves as
> > if the bits have been cleared, any as-if transformation is fine.
>
> So if you put a piece of code (maybe small, maybe large) between
> std::clear_padding and std::popcount, the compiler may not optimize
> std::popcount the way you want and give you an unspecified result? This
> seems dangerously close to the "it's undefined behavior, but it did
> generate the right code once on my machine, and that's all the
> guarantees I need!" approach to C++ programming. Actually ... it's not
> just dangerously close, the way you use reinterpret_cast is simply UB.
>
> I think Sebastian is asking the right question here; you should be able
> to put an arbitrary distance between std::clear_padding and
> std::popcount and retain the semantics.

Why?

The code is supposed to work as it is written. The fact that different
code may behave differently is not an argument for disallowing the
former piece of code. If the developer intends to ensure that the
padding is cleared for subsequent consumption, he will write the code
accordingly, and the standard should guarantee that the code behaves as
expected.

BTW, I don't think the specific example above with popcount is valid,
because accessing a long double through a uint128_t reference violates
the aliasing rules; it should become valid if reinterpret_cast is
replaced with std::bit_cast.

> My intuition is that you could get those guarantees if you ran
> std::clear_padding, std::memcpy to a byte array, and then std::bit_cast
> out of a byte array into std::uint128_t. However, that forces compilers
> to permanently spill objects from registers (such as function parameters
> passed via register) into memory when std::clear_padding is used on them
> because it's possible that someone will later read the padding bits.

No. Whether the compiler spills registers is a matter of QoI. The
compiler may keep track internally that the padding is cleared and
optimize the subsequent code accordingly. Current compilers are very
good at tracking bit patterns and optimizing based on them already.

That wouldn't hold for x87 long double specifically, because there are
no instructions to compute popcount directly on an x87 register or to
move an x87 register to a GPR. Well, there is MMX, but it only accesses
64 bits of the x87 registers, not 80. So the compiler will have to pass
the value through memory, but that is a hardware-imposed limitation,
not a standard-imposed one.
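
For illustration, the byte-array route mentioned in the quoted text can be
sketched without the proposed std::clear_padding. This is only a sketch
under stated assumptions: popcount_value_bits and long_double_value_bytes
are hypothetical names, the padding layout is assumed to be that of x87
extended precision on x86 (10 value bytes followed by padding), bytes are
assumed to be 8 bits wide, and std::bitset stands in for std::popcount so
that the sketch does not depend on <bit>.

```cpp
#include <bitset>
#include <cstddef>
#include <cstring>
#include <limits>

// Assumption: x87 extended precision on x86 (64-bit significand,
// little-endian), where the first 10 bytes of the object representation
// hold value bits and the remaining bytes are padding. For any other
// format, every byte is treated as a value byte.
constexpr std::size_t long_double_value_bytes()
{
    return (std::numeric_limits<long double>::digits == 64 &&
            sizeof(long double) > 10)
               ? 10
               : sizeof(long double);
}

// Hypothetical helper, not a standard function: counts the set bits of a
// long double while ignoring whatever the padding bytes contain.
int popcount_value_bits(long double f)
{
    // Copy the object representation into a byte array.
    unsigned char bytes[sizeof(long double)];
    std::memcpy(bytes, &f, sizeof f);

    // Hand-written stand-in for the proposed std::clear_padding:
    // overwrite the padding bytes without ever reading them.
    for (std::size_t i = long_double_value_bytes(); i < sizeof bytes; ++i)
        bytes[i] = 0;

    // Count the set bits byte by byte (assumes 8-bit bytes).
    int count = 0;
    for (unsigned char b : bytes)
        count += static_cast<int>(std::bitset<8>(b).count());
    return count;
}
```

With the x87 format, popcount_value_bits(1.0L) should count the 14 set
exponent bits plus the explicit integer bit of the significand. Whether the
compiler keeps the byte array in memory or folds the loops away is, again,
a QoI matter.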

Received on 2026-01-23 11:29:27