Date: Fri, 23 Jan 2026 06:41:28 +0100
>
> I think that's the wrong question to ask. Let's take the hypothetical
> example of a popcount over a long double:
>
> int popcount(long double f)
> {
>     std::clear_padding(f);
>     return std::popcount(reinterpret_cast<uint128_t &>(f));
> }
>
> It doesn't *need* to emit code. In fact, I'd expect this to *suppress*
> code by knowing that 6 of the 16 bytes are zero and thus don't contribute
> to the count of bits set. The compiler should realise it only needs to
> popcount 8+2 bytes.
>
> The important question is that of behaviour. So long as it behaves as if
> the bits have been cleared, any as-if transformation is fine.
>
So if you put a piece of code (maybe small, maybe large) between
std::clear_padding and std::popcount, the compiler may not optimize
std::popcount the way you want, and you get an unspecified result? This
seems dangerously close to the "it's undefined behavior, but it did
generate the right code once on my machine, and that's all the guarantees I
need!" approach to C++ programming. Actually ... it's not just dangerously
close: the way you use reinterpret_cast is simply UB, because it reads a
long double through a uint128_t reference, which violates the aliasing
rules.

I think Sebastian is asking the right question here; you should be able to
put an arbitrary amount of code between std::clear_padding and
std::popcount and retain the semantics.

My intuition is that you could get those guarantees if you ran
std::clear_padding, std::memcpy to a byte array, and then std::bit_cast out
of a byte array into std::uint128_t. However, that forces compilers to
permanently spill objects from registers (such as function parameters
passed via register) into memory when std::clear_padding is used on them,
because it's possible that someone will later read the padding bits. Maybe
even the mere possibility that std::clear_padding is run in some opaque
function causes pessimizations. This would definitely need to be
investigated when proposing a std::clear_padding function.
Received on 2026-01-23 05:41:41
