Date: Sat, 17 Jan 2026 15:54:27 +0300
On 16 Jan 2026 20:29, Jan Schultke wrote:
>
>> The implementation cannot not copy the padding bits when the user
>> invokes memcpy.
>
> That is wrong. There is no way to observe that padding bits have been
> copied anyway, so this can be optimized away under the as-if rule. In
> fact, memcpy can be (and in practice is) optimized to a mov between
> registers when possible, and a mov of x87 floating-point types does not
> copy padding bits because none exist in the first place.
You are confusing the code generated by the implementation with the
language specification again. The compiler generates this code because
the standard permits padding bytes not to be preserved during regular
value copies. It gives no such liberty with memcpy.
The C++ standard refers to C for the definition of memcpy, and C
defines it as follows:
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
Description
The memcpy function copies n characters from the object pointed to by
s2 into the object pointed to by s1. If copying takes place between
objects that overlap, the behavior is undefined.
C23, 7.26.2.1 The memcpy function.
So, to use your example of the 80-bit long double on x86, where
sizeof(long double) is 16, the behavior of memcpy must be equivalent to
copying 16 bytes from the source to the target, not 10. This can be
observed by a program such as this:
#include <cstring> // memcpy, memcmp

bool test(const long double& d1)
{
    unsigned char buf1[sizeof(long double)];
    memcpy(buf1, &d1, sizeof(long double));  // copy all 16 bytes out, padding included
    long double d2;
    memcpy(&d2, buf1, sizeof(long double));  // copy them into another long double
    unsigned char buf2[sizeof(long double)];
    memcpy(buf2, &d2, sizeof(long double));  // copy that object's bytes out again
    return memcmp(buf1, buf2, sizeof(long double)) == 0;
}
This function must always return true under the specification of memcpy
and memcmp (the latter is also defined in C as operating on n
characters). Whether the compiler generates code that actually copies
and compares 16 bytes, or 10 bytes (for *both* the copies and the
comparison), or does neither and just generates an equivalent of
`return true`, is a QoI matter. What it cannot do is pretend that
padding is somehow volatile (as in, allowed to magically change its bit
values out of thin air) and generate code that may return false.
> It's like arguing that creating an int variable requires the compiler to
> put four bytes on stack memory. Just no.
It requires the compiler to generate code that behaves as if the
variable exists. And sometimes it does indeed force the compiler to
allocate four bytes on the stack.
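For instance (an illustration of mine, not from your message): once the
variable's address escapes to code the compiler cannot see through, the
as-if rule no longer lets it keep the value purely in a register:

void observe(const int* p); // defined in another translation unit

int f()
{
    int x = 42;
    observe(&x); // x must exist in real storage here, typically on the stack
    return x;
}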
>> It could theoretically do something funny with them when the user
>> invokes bit_cast, but that is still acceptable if the result is
>> usable (specifically, those bits of the result that were not produced
>> from the input padding bits).
>
> Yes, in fact it does something funny with constant evaluation; as
> mentioned, LLVM does not have an object representation and does not have
> any tracked padding bits at compile time. Object representations for
> bit-casting are generated on the fly.
For reference, here is how bit_cast behavior is defined:
https://eel.is/c++draft/bit.cast#4
Specifically, these two sentences relate to the handling of padding:
1. Padding bits of the result are unspecified.
2. A bit in the value representation of the result is indeterminate if
it does not correspond to a bit in the value representation of `from`...
This is what gives the compiler the liberty to do funny stuff with
padding. #1 allows it not to propagate input bits that do not map onto
output value representation bits, and #2 allows it to ignore input
padding bits and instead produce garbage in the corresponding output
value representation bits.
There is no way to examine padding bits at compile time (yet), so the
behavior with respect to padding is not important in constexpr at this
point. There is a way to examine padding at run time (e.g. by accessing
the buf1/buf2 elements in the code snippet above), so if we want
consistent behavior at run time, those sentences need to change.
I think this specification should be brought closer to memcpy. That is,
it should amount to copying the input bits to the output bits, whether
the input bits are padding or value representation bits. This would
still mean that the result is indeterminate if the input padding bits
weren't cleared beforehand.
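To illustrate how the current wording is observable at run time, here is
a sketch of mine (assuming the usual layout where the char member is
followed by padding bytes):

#include <array>
#include <bit>

struct S
{
    char c; // typically followed by padding bytes before i
    int i;
};

std::array<unsigned char, sizeof(S)> to_bytes(const S& s)
{
    auto bytes = std::bit_cast<std::array<unsigned char, sizeof(S)>>(s);
    // The elements of `bytes` produced from value representation bits of
    // S are well-defined. Under sentence #2, the elements that map onto
    // the padding bits of S are indeterminate, even if the padding bytes
    // of `s` held known values before the call.
    return bytes;
}

Under the memcpy-like wording suggested above, `bytes` would simply
reproduce the object representation of `s`.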
>> No one is proposing to rely on padding bits. On the contrary, I and
>> David propose to add a utility for clearing padding bits, which, if
>> anything, could improve security.
>
> But your utility has no observable effect in the abstract machine.
> Padding bits do not really exist (see the two examples I just brought
> up). You're clearing bits that don't really exist, and the user has no
> way to observe their values anyway.
Again, padding bits do exist, and the standard acknowledges them in
multiple places, including the bit_cast specification itself. Please
stop this nonsense about magical non-existent padding bits. They do not
participate in the value representation, and the compiler is not
required to preserve their values in normal operations on the object,
but that is about the extent of their special treatment.
The proposed clear_padding utility would be defined as "sets all bits
in the object representation that do not participate in the value
representation to zero", or something along these lines. This is a
perfectly reasonable definition that is in line with the existing
object model.
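For one concrete type, this effect can already be emulated today with
memcpy alone. Here is a manual sketch of mine (the struct and the helper
name are made up) of what the proposed utility would automate:

#include <cstddef> // offsetof
#include <cstring> // memcpy

struct S
{
    char c; // padding typically follows
    int i;
};

// Per-type equivalent of the proposed clear_padding: rebuild the object
// representation in an all-zero buffer, copying only the members, then
// copy the buffer back over the object.
void clear_padding_of_S(S& s)
{
    unsigned char buf[sizeof(S)] = {}; // every byte is zero
    std::memcpy(buf + offsetof(S, c), &s.c, sizeof s.c);
    std::memcpy(buf + offsetof(S, i), &s.i, sizeof s.i);
    std::memcpy(&s, buf, sizeof s); // padding bytes of s are now zero
}

After such a call, byte-wise operations on the object (memcmp-based
comparison, hashing of the object representation, writing it to an
untrusted sink) no longer depend on garbage in the padding bytes.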
>> memcpy must write something to the target bits that the source
>> padding bits map onto. You could argue that memcpy could magically
>> produce garbage there instead of actually reading it from the source
>> (although no implementation does such silly things, to my knowledge),
>> but that would still be acceptable if the user doesn't use those
>> garbage bits. And I maintain, the user should be allowed to do that;
>> I see no reason why he shouldn't.
>
> No, it's not required to do anything like that because you're not able
> to observe whether it actually copied memory.
This is wrong, see above.
> Memory is an
> implementation detail, and the implementation is free to not copy
> anything if it knows that all the input bytes are indeterminate anyway.
> There is no observable distinction between any two indeterminate bytes.
The values of padding bits are unspecified, but padding bits are not
volatile: they cannot change their values out of the blue. This is
observable at run time, and the compiler must generate code that
behaves accordingly.
>> So what you're saying is that there is no legal way to bit_cast a
>> _BitInt(3) to something else (e.g. uint8_t), that this is a good
>> thing? I'm sorry, but I disagree.
>
> I'm saying it's not a good thing, and it would be useful to have
> something like std::bit_cast_zero_padding, or for padding wipes to be
> the default behavior of std::bit_cast. I just don't see any need to
> overhaul the object model, possibly disallowing vast amounts of
> optimizations.
I don't see why overhauling the object model or disabling vast amounts
of optimizations would be needed. The only two things that would change
are:
1. Adding clear_padding, which sets padding bits to zero; that is
exactly what we want. Since this is a new function, it doesn't affect
existing code.
2. Making bit_cast preserve input padding bits that map onto value
representation bits in the output (see the sketch after this list).
Although this may affect existing code, I doubt it will cause major
performance differences.
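To make the intent of #2 concrete, here is a sketch of mine (again using
a struct with internal padding, like in the examples above):

#include <array>
#include <bit>

struct S
{
    char c; // typically followed by padding bytes
    int i;
};

bool stable(const S& s)
{
    using Bytes = std::array<unsigned char, sizeof(S)>;
    Bytes b1 = std::bit_cast<Bytes>(s);
    Bytes b2 = std::bit_cast<Bytes>(s);
    // Today, the elements of b1 and b2 that map onto the padding bits of
    // S are indeterminate, so b1 == b2 is not guaranteed to hold. With
    // change #2, every input bit (padding included) would be copied, so
    // this would have to return true, just like the memcpy-based test()
    // above.
    return b1 == b2;
}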
