std-proposals

Re: [std-proposals] Fixing std::bit_cast padding bit issues

From: David Brown <david.brown_at_[hidden]>
Date: Fri, 16 Jan 2026 14:37:32 +0100
On 16/01/2026 13:42, Jan Schultke wrote:
>
> As I understand it (and I am not sure here, and happy to be corrected),
> padding has unspecified values, but when you copy them with bit_cast<>
> you get indeterminate bits. And then if those bits are read, you
> have UB.
>
>
> Yes, that's correct.
>

Thanks. These details are not always easy to get entirely correct, so I
appreciate the confirmation.

> Could it be possible to change bit_cast<> so that these copied bits
> remain unspecified? I believe an aim of the way bit_cast<> is defined
> is to let it work without having to copy any padding (bits or bytes),
> thus the returned value could have indeterminate bits. If it is
> possible to re-classify these bits as unspecified rather than
> indeterminate, without loss of efficiency, then that would be an
> improvement, I think. (There may be some hardware platforms that track
> determinate / indeterminate status of data.)
>
>
> The obvious problem is constant evaluation. In the aforementioned case
> of bit-casting long double to __int128, you don't want to end up with
> non-deterministic integer bits at compile time. We could say that you
> don't have a constant expression in the degenerate case, or that the
> bits are guaranteed to be zero during constant evaluation, but any
> solution here creates some deviation in behavior between run-time and
> constant evaluation.
>

I agree - consistency is important here.

> I think some kind of "pad to zero" concept could have uses beyond this
> case, and be useful in its own right. You could have :
>
> T padded_to_zero(const T& x)
>
> that would return a copy of x, where all the padding bits and bytes are
> set to zero. And you could have :
>
> pad_to_zero(T& x)
>
> that would zero out the padding on an existing object.
>
>
> That does not make any sense in the C++ object model. Padding bits are
> bits that are not in the value representation, and they don't get
> copied/preserved when values are copied.

They are not copied or preserved with most copies, but they /are/
accessible via things like memcpy and access using character or
std::byte pointers. So sometimes they are relevant despite not being
part of the value of the object.

> If you store the result of
> padded_to_zero in a variable and copy that variable into a second one,
> the padding bits of these variables may differ.

Yes - /if/ you copy the object as a value, rather than the underlying
representation.

Suppose we have a structure with padding :

 struct S { uint8_t a; uint32_t c; };

Assuming common alignments (and ignoring any other implementation
dependent details for simplicity), there will be 3 bytes of padding between
"a" and "c" - somewhat akin to a hidden field "unsigned char b[3];".

If you create an object S s1, then its "b" field is unspecified. If
you write "S s2 = s1;", then s2's "b" field is also unspecified, and you
have no reason to suppose that s2.b and s1.b would be equal (even if it
made sense to compare them). "memcmp(&s1, &s2, sizeof(S))" is not
guaranteed to give 0.

However, if you write "memcpy(&s2, &s1, sizeof(S));", then I think it is
reasonable to expect "memcmp(&s1, &s2, sizeof(S))" to give 0 - even
though the padding bytes are still unspecified - because you have copied
a block of bytes, not values of type S.

Am I correct to expect this? Or do all guarantees or information about
the padding bytes "disappear" as soon as the memcpy() is complete?
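
The memcpy case can be sketched as follows, assuming the layout described above. The helper names copy_bytes_then_compare and memcpy_example are hypothetical. The sketch writes every byte of s1 with memset before use, so that no never-written byte is read. The assert expresses the expectation stated above as observed on mainstream compilers; whether the language guarantees it is exactly the question being asked.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct S {
    std::uint8_t a;
    std::uint32_t c;
};

// Copies the whole object representation (padding included) as a block of
// bytes, then compares the two representations byte for byte.
inline int copy_bytes_then_compare(const S& src) {
    S dst;
    std::memcpy(&dst, &src, sizeof(S));
    return std::memcmp(&dst, &src, sizeof(S));
}

inline void memcpy_example() {
    S s1;
    std::memset(&s1, 0, sizeof(S));  // every byte, padding included, gets written once
    s1.a = 1;
    s1.c = 2;
    // Expectation from the text, not a guarantee taken from the standard.
    assert(copy_bytes_then_compare(s1) == 0);
}
```

Replacing the memcpy with "S dst = src;" turns the byte copy into a value copy, which is the case where the comparison is not expected to give 0 reliably.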

So if we have "S s1" and write "pad_to_zero(s1)", the result will be the
same as if we had written "memset((unsigned char*) &s1 + 1, 0, 3)" to
zero out the hidden "b" field.

If we then write "S s2 = s1;", then of course the padding bytes could be
different.

But if we write "uint32_t x = sha256((const unsigned char*) &s1,
sizeof(S));", then the function that is accessing the underlying storage
via a character pointer (or as std::byte) would see the struct with
zeros in the padding bytes.
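
A hand-written sketch of that idea for S only, assuming the padding sits solely between "a" and "c" (no trailing padding, as on common ABIs). pad_to_zero here is a hypothetical stand-in for the proposed facility, not an existing function, and a small FNV-1a hash replaces sha256 purely to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct S {
    std::uint8_t a;
    std::uint32_t c;
};

// Hypothetical, written for S only: zero the bytes between the end of 'a'
// and the start of 'c' (the hidden "b" field in the text).
inline void pad_to_zero(S& s) {
    constexpr std::size_t gap_begin = offsetof(S, a) + sizeof(S::a);
    constexpr std::size_t gap_end = offsetof(S, c);
    std::memset(reinterpret_cast<unsigned char*>(&s) + gap_begin, 0,
                gap_end - gap_begin);
}

// 32-bit FNV-1a over raw bytes; a stand-in for a real digest.
inline std::uint32_t fnv1a(const unsigned char* data, std::size_t size) {
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 16777619u;
    }
    return hash;
}

// Zeroes the padding, then hashes the object representation.
inline std::uint32_t hash_representation(S& s) {
    pad_to_zero(s);
    return fnv1a(reinterpret_cast<const unsigned char*>(&s), sizeof(S));
}

inline void pad_to_zero_example() {
    S s1, s2;
    std::memset(&s1, 0xFF, sizeof(S));  // different filler in each object
    std::memset(&s2, 0xAA, sizeof(S));
    s1.a = s2.a = 1;
    s1.c = s2.c = 2;
    // Equal member values plus zeroed padding are expected to hash alike.
    assert(hash_representation(s1) == hash_representation(s2));
}
```

The hash is taken immediately after the padding is zeroed, with no value copy of the object in between, since a value copy is the step after which the padding bytes could differ again.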

> It's not even clear how
> you can store the result of padded_to_zero in a variable in a way that
> keeps padding bits, considering that the implementation does not copy
> the padding bits from the function's result value into the variable at
> this time.
>
> For all intents and purposes, padding bits do not exist, and you cannot
> manipulate them or examine their value.
>
> The key point with these would be that you would have a consistent
> underlying representation of the object, for use with things like
> hashing, digital signatures, memcmp-style comparisons, etc.
>
>
> Using memcmp on a type with padding bits always results in undefined
> behavior, and I would not want it to be any other way.
>

When I write structs myself that would normally contain padding, and
where I want to have consistency (because transferring structs into and
out of code, hashes, and data serialisation is not uncommon), I add
explicit "padding" or "dummy" fields so that I know everything is
well-defined. But it would be much more convenient to be able to do
this kind of thing in a more automated manner. But if there is no way
to make sense of this within the C++ object model, so be it.
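
The explicit-field workaround can be sketched like this, assuming C++17 and a common ABI in which uint32_t is 4-byte aligned. The names S_explicit and reserved are hypothetical. std::has_unique_object_representations is a standard trait, and it reports whether equal values always have identical object representations, which rules out padding.

```cpp
#include <cstdint>
#include <type_traits>

// The struct from the example above, with implicit padding after 'a'.
struct S {
    std::uint8_t a;
    std::uint32_t c;
};

// Same layout, but the padding is replaced by a named member that the
// programmer initialises like any other field.
struct S_explicit {
    std::uint8_t a;
    std::uint8_t reserved[3];  // stand-in for the 3 implicit padding bytes
    std::uint32_t c;
};

// These checks hold under the stated ABI assumption; they are compile-time
// guards, so a platform with a different layout fails loudly.
static_assert(sizeof(S_explicit) == sizeof(S), "layout assumption violated");
static_assert(!std::has_unique_object_representations_v<S>,
              "S was expected to contain padding");
static_assert(std::has_unique_object_representations_v<S_explicit>,
              "S_explicit was expected to be padding-free");
```

With the guard in place, byte-wise hashing or comparison of S_explicit depends only on member values, at the cost of maintaining the reserved fields by hand.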


> But it would not solve the indeterminate bits in bit_cast<>, without
> changes to the specification of bit_cast<>. padded_to_zero(x) would
> still have the same padding bits and bytes, and these would still be
> indeterminate after the bit_cast<>.
>
>
> Yes, we need to make changes to bit_cast to deal with these issues.
>

I can certainly appreciate your desired changes to bit_cast, whether or
not any kind of "pad_to_zero" or "padded_to_zero" could exist.

Received on 2026-01-16 13:37:39