Date: Fri, 16 Jan 2026 13:42:47 +0100
> As I understand it (and I am not sure here, and happy to be corrected),
> padding has unspecified values, but when you copy them with bit_cast<>
> you get indeterminate bits. And then if those bits are read, you have UB.
>
Yes, that's correct.
> Could it be possible to change bit_cast<> so that these copied bits
> remain unspecified? I believe an aim of the way bit_cast<> is defined
> is to let it work without having to copy any padding (bits or bytes),
> thus the returned value could have indeterminate bits. If it is
> possible to re-classify these bits as unspecified rather than
> indeterminate, without loss of efficiency, then that would be an
> improvement, I think. (There may be some hardware platforms that track
> determinate / indeterminate status of data.)
>
The obvious problem is constant evaluation. In the aforementioned case of
bit-casting long double to __int128, you don't want to end up with
non-deterministic integer bits at compile time. We could say that you don't
have a constant expression in the degenerate case, or that the bits are
guaranteed to be zero during constant evaluation, but any solution here
creates some deviation in behavior between run-time and constant evaluation.
> I think some kind of "pad to zero" concept could have uses beyond this
> case, and be useful in its own right. You could have :
>
> T padded_to_zero(const T& x)
>
> that would return a copy of x, where all the padding bits and bytes are
> set to zero. And you could have :
>
> pad_to_zero(T& x)
>
> that would zero out the padding on an existing object.
>
That does not make any sense in the C++ object model. Padding bits are bits
that are not in the value representation, and they don't get
copied/preserved when values are copied. If you store the result of
padded_to_zero in a variable and copy that variable into a second one, the
padding bits of these variables may differ. It's not even clear how you could
store the result of padded_to_zero in a variable in a way that keeps its
padding bits, considering that implementations are not currently required to
copy the padding bits from the function's result into the variable.
For all intents and purposes, padding bits do not exist, and you cannot
manipulate them or examine their value.
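
As a rough illustration, the following sketch uses the standard trait
std::has_unique_object_representations to show when an object representation
contains bits that are not determined by the value. The types Packed and
Padded are hypothetical, and the layout of Padded is an assumption about
typical platforms (checked by a static_assert).

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical types used only for illustration.
struct Packed { std::uint32_t a; std::uint32_t b; };
struct Padded { std::uint8_t tag; std::uint32_t value; };

// Assumption: the implementation inserts padding after `tag` so that
// `value` is suitably aligned. This holds on common platforms.
static_assert(sizeof(Padded) > sizeof(std::uint8_t) + sizeof(std::uint32_t));

// Packed: equal values imply equal object representations.
static_assert(std::has_unique_object_representations_v<Packed>);

// Padded: two objects with the same value may differ in their bytes,
// because the padding is not part of the value representation.
// Note that the trait can also be false for other reasons (for example
// floating-point members), so it is not a pure padding detector.
static_assert(!std::has_unique_object_representations_v<Padded>);

// Copying is specified in terms of values: after this, only the members
// are guaranteed to match, not the padding bytes.
inline Padded copy_value(const Padded& p) {
    Padded q = p;
    assert(q.tag == p.tag && q.value == p.value);
    return q;
}
```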
> The key point with these would be that you would have a consistent
> underlying representation of the object, for use with things like
> hashing, digital signatures, memcmp-style comparisons, etc.
>
Using memcmp on a type with padding bits always results in undefined
behavior, and I would not want it to be any other way.
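
For the use cases mentioned in the quote, one option that stays within the
object model is to build the byte sequence from the members explicitly, so
that padding is never read. This is only a sketch: Record, value_bytes and
same_value are hypothetical names, and the byte order of the result is the
platform's native one.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical type with padding between `tag` and `value` on typical
// platforms.
struct Record { std::uint8_t tag; std::uint32_t value; };

// Number of bytes that belong to the members themselves.
inline constexpr std::size_t record_value_size =
    sizeof(std::uint8_t) + sizeof(std::uint32_t);

// Hypothetical helper: copies only the members into a contiguous buffer.
// Neither member type has padding bits, so every byte written here is
// determined by the value of `r`.
inline std::array<unsigned char, record_value_size>
value_bytes(const Record& r) {
    std::array<unsigned char, record_value_size> out{};
    out[0] = r.tag;
    std::memcpy(out.data() + 1, &r.value, sizeof r.value);
    return out;
}

// memcmp is applied to the padding-free buffers, not to the objects.
inline bool same_value(const Record& x, const Record& y) {
    const auto bx = value_bytes(x);
    const auto by = value_bytes(y);
    return std::memcmp(bx.data(), by.data(), bx.size()) == 0;
}
```

The same buffer could be fed to a hash function or a signature routine. For
plain equality, a memberwise operator== achieves the same result without any
byte copying.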
> But it would not solve the indeterminate bits in bit_cast<>, without
> changes to the specification of bit_cast<>. padded_to_zero(x) would
> still have the same padding bits and bytes, and these would still be
> indeterminate after the bit_cast<>.
>
Yes, we need to make changes to bit_cast to deal with these issues.
Received on 2026-01-16 12:43:01
