Date: Fri, 14 Mar 2025 22:46:06 -0400
On Fri, Mar 14, 2025 at 11:07 AM Giuseppe D'Angelo via Std-Discussion <
std-discussion_at_[hidden]> wrote:
> > otherwise, the bytes have erroneous values, where each value is
> > determined by the implementation independently of the state of the
> > program.
>
> I understood that sentence in the sense that filling isn't required,
> because the values you'd eventually read from the buffer don't come from
> the state of the program (they don't even exist, as far as the abstract
> machine is concerned?).
>
> I may be super-wrong with this interpretation, though, and I'd like some
> clarity as well. If I'm mistaken, I 100% agree with you that it would be
> a completely unacceptable performance hit for any class that wraps a
> buffer of some sort (incl. any form of short object optimization).
>
Right, I was confused by this as well (see my earlier message). Like
Giuseppe, my first reading was that a program with erroneous behavior which
does execute just needs to act as if every erroneous value has a specific
value, rather than the undefined behavior "anything is correct" freedom.
And that's exactly what most practical release-mode compiler
implementations will actually do already.

"Independently of the state of the program" is very unclear to me. It
doesn't fit the C++ abstract machine context at all.

P2723R1 suggests one main motivation is to prevent a type of security hole
where one function writes "sensitive" data to the stack, then after that
function has popped and another function is using the same region of stack
memory, it might end up sending those old sensitive byte values to an
output, via reading uninitialized variables, padding bytes, or bytes in a
union but not in the active member. That might explain why the changes are
just for automatic storage, not for dynamic storage: it's still possible
old sensitive data exists in heap storage, but maybe it's harder for an
exploit to get at that sort of memory, which is reused in much less
localized ways. If that's what's intended, one requirement that could work
in terms of the abstract machine would be that (unless the implementation
prevents the execution entirely because of the EB) every instance
of storage associated with the same automatic block variable's definition
initially has the same erroneous values, even across different executions of
the same program. But in terms of [intro.abstract], this can't be
"implementation-defined behavior" if we want to allow optimizations other
than always using zero or some other defined byte pattern. Nor is it
"unspecified behavior", since that is the case for which [intro.abstract] says
"An instance of the abstract machine can thus have more than one possible
execution for a given program and a given input", which is exactly what
such a requirement would rule out.
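
To make the performance concern from the quoted text concrete, here is a
minimal sketch of a class that wraps a buffer. SmallBuffer, its members, and
sum_of_abc are hypothetical names made up for this illustration; they are not
taken from P2723R1 or from any library. The array has no initializer, and the
code only ever reads bytes it has written itself, so it never performs an
erroneous read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical small-buffer wrapper, for illustration only.
class SmallBuffer {
public:
    static constexpr std::size_t capacity = 64;

    // Copies n bytes into the buffer; the caller must not exceed capacity.
    void append(const void* src, std::size_t n) {
        assert(n <= capacity - size_);
        std::memcpy(bytes_ + size_, src, n);
        size_ += n;
    }

    std::size_t size() const { return size_; }

    // Only bytes below size() were ever written, so only those may be read.
    unsigned char at(std::size_t i) const {
        assert(i < size_);
        return bytes_[i];
    }

private:
    unsigned char bytes_[capacity];  // deliberately no initializer
    std::size_t size_ = 0;
};

// Creates an automatic SmallBuffer, writes three bytes, and sums them.
// The remaining 61 bytes of the array are never read.
inline int sum_of_abc() {
    SmallBuffer buf;  // automatic storage; this code does not fill bytes_
    buf.append("abc", 3);
    int total = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) {
        total += buf.at(i);
    }
    return total;
}
```

If the first reading above is right, nothing about this class needs to change.
If instead every byte of bytes_ had to be given a fixed value whenever such an
object is created with automatic storage duration, the cost would be a 64-byte
fill per object even though only three bytes are ever used here.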
-- Andrew Schepler
Received on 2025-03-15 02:46:22