C++ Logo

std-proposals

Advanced search

Re: [std-proposals] When does vector::push_back() invalidation happen?

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Sun, 31 Mar 2024 19:38:31 +0100
On Sun, 31 Mar 2024, 18:51 Avi Kivity, <avi_at_[hidden]> wrote:

> On Sun, 2024-03-31 at 18:28 +0100, Jonathan Wakely via Std-Proposals wrote:
>
>
>
> On Sun, 31 Mar 2024, 16:52 Thiago Macieira via Std-Proposals, <
> std-proposals_at_[hidden]> wrote:
>
> On Sunday, 31 March 2024 08:22:05 PDT Avi Kivity via Std-Proposals wrote:
> > If it
> > reallocates after the push_back(), and the move constructor can throw,
> > then we lose the strong exception guarantee.
>
> It's impossible to reallocate after, because the vector must create space
> to
> store the new element. It can't store the element where there's no room.
>
> > The standard says:
> > > Remarks: Causes reallocation if the new size is greater than the old
> >
> > capacity. Reallocation invalidates all the references, pointers, and
> > iterators referring to the elements in the sequence, as well as the
> > past-the-end iterator.
> >
> > So, it says nothing about whether a push_back referring to a vector
> > element is legal.
> >
> > Is this undefined behavior? Should it be specified to work? Should it
> > be noted that it is dangerous?
>
> Looks pretty clear to me: if the new size is going to be bigger than the
> previous capacity, then it invalidates and therefore the reference stored
> by
> binding the parameter to v.back() is dangling. It's clearly UB to
> dereference
> it and therefore not expected to work.
>
>
> No, it's guaranteed to work. v.push_back(std::move(v.back())) doesn't have
> to work, but Avi's example does. The standard says the argument is pushed
> back, it doesn't say "unless it aliases the container".
>
> This has been well understood and thoroughly tested by implementers for
> decades.
>
>
>
>
> Yes, it was reported to me that libstdc++'s std::vector works.
>
> But, it's not clear from the wording. I'm sure that if the move
> constructor accesses the vector (via some side channel) it won't work.
>

That's undefined, you can't access the vector reentrantly. While the
push_back call is actively modifying the container, another call to it is
undefined.

We try to say this in https://eel.is/c++draft/reentrancy but we completely
fail to specify anything clearly. There's been an open library issue about
this for years.




To be clear the standard has to define the ordering between the new T's
> construction and the resize.
>

The resize is not atomic. New memory is allocated, then elements are
constructed, then the old elements destroyed, and the old memory
deallocated.

It's guaranteed that the right value will be correctly inserted (so it has
to be copied before old elements are modified) but I don't agree that more
than that needs to be specified.

In practice, the only way to make it work is to construct the pushed back
element as soon as new memory has been allocated (and before any other
elements have been moved/copied from them old storage) but there's no
guarantee whether the vector's size() or capacity() is "correct" at that
point, or whether begin() and data() point to the old storage or the new.
Its invariants are not required to hold while it's in the middle of being
modified by a non-constant member function. In practice, all those member
functions probably return the old values until after all elements have been
constructed in the new storage, but if for example your allocator tried to
inspect the vector when the old storage was being deallocated, there's no
guarantee what state you'd observe. If your code needs to do that, you're
probably doing something wrong.

Received on 2024-03-31 18:39:51