sg14: Re: [SG14] Colony Reserve update

From: Klaim - Joël Lamotte <mjklaim_at_[hidden]>
Date: Thu, 6 Aug 2020 14:26:22 +0200

Hi, I have some related suggestions:
1. Was this considered as an alternative: have a "reserve(int)" function
that adds a new block "manually" to the chain of blocks, with at least
enough space for the required number of elements (not counting the previous
elements or blocks).
I think it would be named differently from reserve, so maybe
"reserve_block(int)".
That way:
- it's basically what insert will do anyway when growing the container
(with some constraints), it does not need to add conditions to the current
implementations (I believe, I might be wrong);
- we can achieve being able to pre-allocate outside of the hot-loops, in
particular if we call it only once after creation.
- it gives control to the users who need to be able to dynamically grow
the container but also have the information allowing them to anticipate how
to grow it (because reserve_block() can be called again if there is an
incoming set of data that will need to be added).

2. If such reserve_block(int) function was provided, then would it be
interesting to have a "shrink_to_last()" function that would only remove
blocks at the end of the chain IFF they are empty. The last block which
contains elements would be untouched.
That way:
- implementers are free to keep the blocks allocated until that function
is explicitly called, no way to end-up in the issue of deleting a block
then recreating one because the user removed an element then added one;
- It allows the user fine-enough control over the allocation to be used
in (I believe) even extreme contexts.

I think these propositions would help achieve the goals .reserve() and
.shrink_to_fit() achieve for other containers without the performance
issues and while still making sense with the semantic of colony.
But I might be wrong, and I don't know if these ideas have been discussed
already?

A. Joël Lamotte

On Wed, 5 Aug 2020 at 00:28, Matthew Bentley via SG14 <sg14_at_[hidden]>
wrote:

> Just to retroactively invalidate my previous statement, I've found that
> GCC 8 and MSVC don't have the same 10% performance detriment as GCC9. I'm
> in the middle of doing more cross-compiler testing to be thorough on this.
> I suspect that the 2% performance difference is likely to hold true, but
> will have to see. I would be interested to know what sort of % performance
> threshold people would tolerate in order to have a working reserve function.
>
> In the meantime I do one other question for people:
> If we allow unused empty memory blocks (which is what reserve ensures),
> that means we can retain memory blocks when all elements in them are
> erased, rather than deallocating the blocks. This has numerous potential
> performance advantages, for example if you have filled all element memory
> blocks and then insert a new element, a new memory block will be created.
> If that element is subsequently erased and then another inserted, on
> repeat, without retaining the memory block you would end up deallocating
> and reallocating a memory block repeatedly.
>
> However it also throws up a lot of implementation-defined decision-making
> as to the capacity of memory blocks which will be retained. For example, we
> would almost always want to retain the back block as this will be the
> largest - and similarly we might want to retain any other blocks which are
> at the maximum memory block size. But if there's twenty memory blocks, and
> the front memory block gets all it's elements erased, we wouldn't want to
> retain it, as it's capacity would be small, which leads to worse cache
> locality. Re-using that block would be a detriment to performance.
>
> Also, retaining erased memory blocks leads to a vector-like scenario where
> the user can have a lot of unused memory after erasures, until a
> shrink-to-fit is called (I just remembered from the june talk that we
> established that shrink-to-fit would be a non-element-reallocating "free
> unused blocks" function much like std::deque, in this case). This has the
> advantage of allowing the user to move a lot of deallocation to
> non-hot-loop areas, but how much deallocation would occur in the hot loops
> themselves would be dependent on the implementation as mentioned in the
> previous paragraph.
>
> Are either of these two aspects problematic, if we retain reserve?
>
>
> Now, replying to Ben:
>
> On Wed, 5 Aug 2020 at 09:23, Ben Craig via SG14 <sg14_at_[hidden]>
> wrote:
>
>> I believe your results, but I'm trying to understand the "why" a bit
>> better. Does the addition of reserve() have knock-on effects on the rest
>> of the design of the container? Perhaps because it makes prior, simple
>> invariants more complicated?
>>
>
> It introduces new code and branches into all the various
> insert/emplace/erase/splice functions, and of course there are minor
> changes to constructors/swap/etc as well, to re-use or retain empty memory
> blocks. As above, I need to do more vigorous cross-compiler testing to
> establish the true effect of this - to find out whether what I've seen with
> GCC9 is merely a bug in it's optimization.
>
>
>>
>> How does the container differ in the following cases:
>> * Reservable design: Start empty, reserve(512) elements, insert 100
>> elements
>> * Non-reservable design: Start empty, insert 512 elements, erase 412
>> elements
>> * Reservable design: Start empty, insert 100 elements
>> * Non-reservable design: Start empty, insert 100 elements
>
>
> Basically as-noted, extra code and branching occuring. In case you're
> interested, I've attached the before (reserve can only allocate one block)
> and after (incomplete, reserve can allocate many blocks, erase can retain
> memory blocks) variants to this email, may not be worth investigating
> further until I've completed my investigation however.
> _______________________________________________
> SG14 mailing list
> SG14_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/sg14
>

Received on 2020-08-06 07:30:23