std-proposals

Re: [std-proposals] Fixing std::bit_cast padding bit issues

From: David Brown <david.brown_at_[hidden]>
Date: Sun, 18 Jan 2026 17:13:18 +0100
On 18/01/2026 16:37, Thiago Macieira wrote:
> On Sunday, 18 January 2026 03:07:40 Pacific Standard Time David Brown wrote:
>> On 18/01/2026 00:24, Thiago Macieira wrote:
>>> On Saturday, 17 January 2026 12:03:13 Pacific Standard Time David Brown wrote:
>>>> And there are plenty of situations where you would want the padding to
>>>> be seen as zeros, or at least to be consistent. Taking a hash of some
>>>> kind is the obvious case - whether it be for use in a hash-map
>>>> structure, for integrity checking (CRC, md5sum), or for secure hashing.
>>>> You might also want it to be zeroed out before passing the object
>>>> outside the program - storing it in a file, or transmitting it on a
>>>> network. Again, it is often helpful for data to be consistent, and more
>>>> paranoid people might want to avoid any risk of leakage of unintended
>>>> data. Often it is most efficient and convenient to access the data here
>>>> using std::byte or unsigned char pointers - and that means the padding
>>>> is visible.
>>>
>>> Most of which require a bit_cast to a byte array in the first place.
>>
>> Do they? People have been doing this sort of thing with memcpy and
>> unsigned char pointers from /long/ before C++20 bit_cast<>, both in C
>> and C++ (AFAIK C++ defers to the C standard for memcpy). bit_cast<>
>> might be a more modern way to deal with this kind of thing, and it will
>> likely be a better route towards consistent compile-time and run-time
>> behaviour, but accessing representations via unsigned char pointer has
>> always been fundamental to low-level work in C and C++. And the unknown
>> contents of padding has been a PITA all that time. Rightly or wrongly,
>> people treat copying the padding via character pointers as getting
>> unspecified data, not as leading to undefined behaviour.
>
> What people have been doing is not important, because they are not using any
> of the solutions we're talking about yet. We are discussing what they need to
> use now.
>
> I also argue there's no such code today because there is no way to clear
> padding. Therefore, no one is knowingly using hashes and checksum functions
> over the storage of an object that contains padding.
>

I don't feel I am fully convinced, but I am happy to move on here.
Getting a good solution for the future is more important.

> So we are talking about a new use-case that we can enable. And therefore, by
> construction, both solutions are acceptable. Which would be most convenient?
>
> There's value in std::clear_padding for this use-case: it wouldn't require the
> callee to create a new buffer of the object before reading the memory
> representation in order to calculate whatever they need to. Since those are
> padding bits, the caller can afford to clear them in their working object,
> since by definition they aren't used to store any value.
>

I agree that clearing padding in an existing object should often be the
most efficient method. After all, padding will normally be no more than
a very small fraction of real objects (except for bools, which are
generally 1 value bit and 7 padding bits, but ABIs typically keep the
padding bits at zero already). So generally a "clear_padding" call will
have little work to do.
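
As a rough sketch of the in-place approach, assuming a hand-written
stand-in for the proposed std::clear_padding: the type Sample and the
helpers clear_padding_sample, fnv1a and hash_in_place below are all
hypothetical names made up for illustration, and the clearing code only
works for this one layout. A real facility would need compiler support
to find the padding of arbitrary types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical example type. On most ABIs there are padding bytes
// between `tag` and `value`.
struct Sample {
    char tag;
    std::int32_t value;
};
static_assert(std::is_standard_layout<Sample>::value,
              "offsetof below relies on standard layout");

// Hand-written stand-in for the proposed clear_padding: zero every byte
// of `s` that does not belong to a member. Valid for Sample only.
inline void clear_padding_sample(Sample& s) {
    auto* bytes = reinterpret_cast<unsigned char*>(&s);
    constexpr std::size_t tag_end = offsetof(Sample, tag) + sizeof(Sample::tag);
    constexpr std::size_t value_begin = offsetof(Sample, value);
    constexpr std::size_t value_end = value_begin + sizeof(Sample::value);
    std::memset(bytes + tag_end, 0, value_begin - tag_end);          // interior padding
    std::memset(bytes + value_end, 0, sizeof(Sample) - value_end);   // tail padding, if any
}

// 64-bit FNV-1a over raw bytes, used here only as a simple example hash.
inline std::uint64_t fnv1a(const void* data, std::size_t n) {
    assert(data != nullptr);
    const auto* p = static_cast<const unsigned char*>(data);
    std::uint64_t h = 14695981039346656037ull;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 1099511628211ull;
    }
    return h;
}

// Clear the padding of the caller's working object, then hash its
// storage directly. No copy of the object is made.
inline std::uint64_t hash_in_place(Sample& s) {
    clear_padding_sample(s);
    return fnv1a(&s, sizeof s);
}

// Test helper: fill the whole object with `fill` first, so the padding
// starts out as garbage, then assign the members.
inline Sample make_sample(unsigned char fill, char tag, std::int32_t value) {
    Sample s;
    std::memset(&s, fill, sizeof s);
    s.tag = tag;
    s.value = value;
    return s;
}
```

With this, two objects holding the same member values hash identically
no matter what was in their padding beforehand, and only the few padding
bytes are written.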

Copying the object would only really be needed if it is passed by const
reference or const pointer - zeroing the padding bits would then be UB
even if it is the case that those bits are not observable (though I
believe they /are/ observable).
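
For the const case, one possible sketch is to build the byte image in a
separate zeroed buffer and copy only the members into it, so the
original object is never written to. Packet, PacketBytes and
canonical_bytes are hypothetical names for illustration, and the member
list is spelled out by hand here; a general solution would need the
compiler to supply that information.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical example type with padding between `kind` and `length`
// on most ABIs.
struct Packet {
    char kind;
    std::int32_t length;
};

using PacketBytes = std::array<unsigned char, sizeof(Packet)>;

// Build a padding-free byte image of `p` without modifying `p`.
// The buffer starts out all zero, and only the bytes that belong to
// members are copied in, so every padding position stays zero.
inline PacketBytes canonical_bytes(const Packet& p) {
    PacketBytes out{};  // value-initialised: every byte is zero
    std::memcpy(out.data() + offsetof(Packet, kind), &p.kind, sizeof p.kind);
    std::memcpy(out.data() + offsetof(Packet, length), &p.length, sizeof p.length);
    return out;
}
```

The cost compared with clearing in place is the extra buffer and a copy
of every member, which is why this would only be the fallback when the
object cannot be modified.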

Received on 2026-01-18 16:13:26