For anyone reading this in gmail, please click on the ... in my previous response, as gmail hides the forwarded portion of the message.

On Wed, 6 May 2020 at 11:59, Matthew Bentley <mattreecebentley@gmail.com> wrote:
> Thanks again, and once more, take anything I don't comment on as accepted criticism/explanation.

Your response didn't go to the list.  I've added the list again.

Ah, whoops - thanks. But you added the incorrect list address- correcting here.

You either need to say which constraints a user-supplied type for Skipfield
is required to satisfy (note it's a template parameter, so the user can
choose), or the template parameter must go.  Currently, the wording is
silent on any constraints for "Skipfield".

Torn here between obviating future improvements and being specific. It needs to stay in order for current implementations to be useful to, in particular, embedded folk, so it would be an unsigned integer type. If future implementations don't use it or use something different, it can be gracefully deprecated I guess.
 

I haven't yet understood the "reverse iterator", --end(), empty container,
and undefined behavior train of thought.  I believe you can only do --end()
on a non-empty container, as a matter of precondition for that operation.


I don't think the C++ spec specifies that iterators have to be bounded?
But certainly including bounding creates performance detriment and should be optional.
In a bounded iterator --end() would be non-problematic, in a non-bounded would result in undefined behaviour.

 

This area probably needs a bit more thought / design work.
It does not seem to be a frequent operation to change min/max
mid-way during your container's lifetime, so offering a small
interface surface here is probably a good idea.

I'm going to look into whether or not specifying min/max limits for block capacities without changing the capacities of existing blocks can have any adverse effects ie. if block 1 is capacity 4, and set_block_capacity_limits sets min capacity to 10, whether leaving block 1 at capacity 4 could have any adverse effects.
If not, will re-implement so that it's a simple "set variables" operation without any side effects.
If so, will re-implement so that set_block_capacity_limits doesn't include the reallocation code when the type is non-copyable/movable (via constexpr and type_traits).

 

Some ABIs can return trivially-copyable structs of sufficiently small size
in registers.  The structured binding above has a good chance that no
stack memory will ever be used, because min/max will be kept in a register
in the caller, too.

If you pass-by-reference, ABIs generally convert that into a pointer,
i.e. the caller needs to pass a memory region and later needs to load
the data from the filled memory region.  That is more work.


I tried implementing it your way, but the code-smell got bad pretty quickly.
You end up with code like:
plf::colony<int, my_allocator_type>::minmax block_sizes;
block_sizes = i_colony.get_block_capacity_limits();

You have to make the struct internal to colony otherwise you can't guarantee that the allocator will supply the same size_type to block_sizes as it does to colony. You could just go with size_t I guess. But the main issues were with using it with constructors, where you end up with this:
plf::colony<int, my_allocator_type>::minmax block_sizes = {50, 500};
plf::colony<int, my_allocator_type> i(block_sizes, my_allocator_instance);

or worse:

plf::colony<int, my_allocator_type> i(plf::colony<int, my_allocator_type>::minmax block_sizes(50, 500), my_allocator_instance);

I tried passing via a bracketed pair of ints, but of course that makes it go into the initializer_list constructor instead.

I think in this instance, the way I've done it is best, or results in less ugly code for the end user. I think the frequency of usage of get_block_capacity_limits does not justify creation of a struct for this purpose. The standard does use pass-by-reference for return parameters, in places - eg. istream& getline - so it's not a big deal I feel.

 
>>  Any sorting near O(n^2) is not worth talking about, so you should simply
>>prescribe O(n log n) with a "remarks" paragraph that this might allocate
>> memory (which might cause std::bad_alloc to be thrown).
>
> The time complexity doesn't relate to the speed of sorting (if it did, std::list sorting would be as fast as std::vector sorting). The schema I'm discussing simply allows for faster sort algorithms which rely on random access, as opposed to relying on a slower algorithm which doesn't ala list.

I'm guessing that std::list::sort also does some tricks with an
intermediate pointer array, so I'm not seeing the point.

It doesn't - it does a basic merge-sort in all implementations I've seen. I pretty much came up with the pointer-sort dereferencing hack (https://plflib.org/list.htm#details) - or at least I haven't seen it elsewhere - that's why plf::list sorting is 77% faster than std::list (see: https://plflib.org/benchmarks_haswell_gcc.htm#sort).
From my old reading of gcc/boost posts, I think the libstdC++ people are pretty against allocating memory during a sort.

 
> Because that's an edge case outside of get_index_from_iterator and results in performance loss in general use when you specialize for it.

One missed branch, I hope?

More than one.
 

Anyway, there seems to be rationale for going beyond std::distance(begin(), it)
that should be recorded in the "Design Decisions" section.

Fair enough.

 
> I don't think that's not the right place for the explanation, the right location is in Design decisions.
> You wouldn't expect an explanation of how a map/vector/deque works, or why some approaches allow for better performance, to be included in the tech spec, so I don't  think you can expect the same of this.

std::deque doesn't allow me to specify block size parameters
from the outside, so I don't care on the specification level.

The interface for std::colony that you offered does allow me to
influence min/max, so you need to normatively specify what such
influence does, otherwise the parameter is meaningless and
therefore needs to go.

I don't think there are abstract explanations for all the parameters which the other containers are able to influence either, at least not to the depth you're talking about. Prove me wrong?

 
>
>     The C++ abstract machine has no way of expressing "cache locality", btw.
>
>
> Not really my concern. I'm interested in high-performance containers, not the C++ abstract machine.

Then you're in the wrong forum here.

Not really. I think the C++ abstract machine is based on some fairly old understandings which don't hold true anymore ie. that time complexity is strongly relevant in some capacity (latency, overall performance) which it isn't outside of large N or large sizeof(N). That being the case, it's not something I should overly concern myself with, as the computers it was designed for no longer exist.



My understanding is that std::colony does not provide random-access iterators.
Only those have relational operators, though, in the general scheme of
iterator operations [random.access.iterators].

There is no specification that the iterators std::colony provides have
relational operators.

There has been no pushback so far from colony having bidirectional iterators with additional relational operators. The iterators have always been specified as bidirectional. If you have a rational objection I'd be interested in hearing it.

 
> Okay, you're trying to get me to move some prose to the tech spec and some to the design decisions. Is there a guide as to what prose belongs where?

Anything explaining "why" should go to the "Design Decisions".

Anything allowing to judge whether my independent (assume bad-faith) implementation
of std::colony is conforming or not (given arbitrarily hostile template parameters
and arbitrarily hostile sequences of operations) needs to into the specification.

Example:
As I said above, "cache-locality" (for instance) is nothing you can talk about
in the C++ standard (because the abstract machine doesn't provide vocabulary
for it), but you might want to talk about only allowing at most one memory
allocation (observable via the Allocator template parameter) per "min"
inserts or so.

Okay. Will contemplate that.
Thanks once again,
Matt