On Fri, 21 Jul 2023, 20:41 Ville Voutilainen, <ville.voutilainen@gmail.com> wrote:
On Fri, 21 Jul 2023 at 22:36, Jonathan Wakely via Std-Proposals
<std-proposals@lists.isocpp.org> wrote:
>
>
>
> On Fri, 21 Jul 2023, 18:16 Arthur O'Dwyer via Std-Proposals, <std-proposals@lists.isocpp.org> wrote:
>>
>> On Fri, Jul 21, 2023 at 12:59 PM Jerry Coffin via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
>>>
>>> There's a corner case for which the current specification of `std::string::append` will frequently lead to undefined behavior.
>>>
>>> Consider something like:
>>>
>>> ```cpp
>>> std::strings = "A long enough string that two copies of it probably won't fit into the currently allocated storage";
>>> s.append(s);
>>> ```
>>
>>
>> Even more natural:
>>     s += s;
>> which is equivalent to
>>     s.append(s)
>> which is equivalent to
>>     s.append(s.data(), s.size())
>> which has the following effects ([string.append]/8):
>> > Appends a copy of the range [s, s + n) to the string.
>>
>> However — it is not clear to me that this wording gives the implementation permission to append anything other than [s, s+n) to the string in the case that the appending operation itself causes [s, s+n) to become invalidated.
>
>
> This is absolutely required to work. The string can detect that the argument aliases its current content and deal with it.
>
> It also has to work for s.insert(pos1, s, pos2, n) and s.replace(pos1, n1, s, pos2, n2) which is trickier, but has to work.
>
> The only time the implementation doesn't have to deal with this is if the string parameters is an rvalue reference, in which case it can assume no aliasing.

Doesn't the proper handling of this case almost fall out from the
exception safety requirements? You'll have to allocate the new buffer
before invalidating
the old one anyway in order to achieve a strong exception safety
guarantee, and then it would be rather hostile to invalidate the old
buffer before
copying the incoming range to the new buffer, especially because you
still need the old buffer to copy *that* to the new buffer too.. :D

You *could* allocate, copy the old string into the new storage, deallocate old storage, then copy the argument into the new storage (which can't throw because you're just copying trivial types). That wouldn't work, and is no easier than just doing it so it works properly, so there's no reason to do it.

As Andrey said, the reallocation case is easier to get right then to get wrong, but the non-reallocating case needs care. And replacing part of the string with itself here really tricky.


So, the spec could perhaps use some clarity improvements,

The standard is not a tutorial ;-)

Like Arthur said, the spec says it does a copy of the range. There's no allowance for "unless that range overlaps the string", it just has to do it.



but it seems
to me that the correct handling of this case almost just.. ..falls out
from
the other requirements?