>
> Torn here between precluding future improvements and being specific. It needs to stay in order for current implementations to be useful to embedded folk in particular, so it would be an unsigned integer type. If future implementations don't use it, or use something different, it can be gracefully deprecated, I guess.

We have the usual cognitive issue of implementation vs. specification here.
I have no objection to your implementation treating a Skipfield template
parameter as a good idea here.  However, your implementation is not the
specification.  The specification is what goes into the standard text wording
and which (I presume) you strive to provide in your "Technical
Specification" section.

The purpose of that wording is to allow independent implementations
to exist, and to judge whether such independent implementations are
conforming ("good") or not.  It's not dissimilar to some legal text,
which allows to judge whether some real-world action is objectionable
or not.  That means, the wording must be self-contained, unambiguous
and complete.

The end goal is to allow porting of user programs from one implementation
to another.  The user program needs to know which features it can rely
on, by carefully inspecting the standard wording only.

This e-mail exchange is solely about getting the "Technical Specification"
as complete and unambiguous and generally high-quality as possible.

Note that implementations (in general) are allowed to add additional
(defaulted) template parameters, I believe, as a conforming extension.

So, back to "Skipfield".  If the "Skipfield" template parameter is mentioned
in the specification anywhere, it establishes a requirement on independent
implementations to provide such a Skipfield parameter.  However, what
should they do with that parameter?  Just ignore it?  If you don't tell
implementations in the wording, they won't know.

In the interest of clarity, let me state I will strongly oppose progressing
this paper at any level (SG14, LEWG, LWG, WG21, CD, DIS) with such known
glaring holes in the wording.  (My expectation is, though, that simply
pointing out the hole to LWG will stop it in its tracks right there.
See https://cplusplus.github.io/LWG/lwg-active.html#3213 and
https://cplusplus.github.io/LWG/lwg-active.html#3439 where we discovered
similar holes a bit late.)

As noted, part of my reason for submitting this paper to SG14 is to find opinions and solutions for such ambiguities. For the next revision I'll try to include wording at the front of the technical specification which makes the multiple-memory-block model, the capacity limits, and their relationship to the skipfield_type parameter explicit. But in the interests of SG14 and its members, including embedded developers, the whole thing is of less worth without the skipfield_type or capacity-limiting capability - partially because capacity limiting allows developers to customise block sizes for a specific cache.
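
To make the intent concrete, here is a minimal sketch of the kind of defaulted skipfield parameter under discussion; the parameter name, position and default below are illustrative only, not proposed wording:

#include <memory>

template <class T,
          class Allocator = std::allocator<T>,
          class SkipfieldType = unsigned short>   // required to be an unsigned integer type
class colony_like;

// A hypothetical embedded use, trading maximum block capacity for smaller
// per-element skipfield overhead:
// colony_like<sensor_reading, std::allocator<sensor_reading>, unsigned char> readings;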

 

>     I haven't yet understood the "reverse iterator", --end(), empty container,
>     and undefined behavior train of thought.  I believe you can only do --end()
>     on a non-empty container, as a matter of precondition for that operation.
>
>
> I don't think the C++ spec specifies that iterators have to be bounded?
> But certainly including bounding creates a performance detriment and should be optional.
> With a bounded iterator --end() would be non-problematic; with a non-bounded one it would result in undefined behaviour.

What do you mean by "bounded"?

Some implementations such as MSVC have optional bounded iterators which will refuse to go before begin() or after end().
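
For illustration, a sketch of what that means in practice (the specific mechanism for enabling checked iterators is toolchain-dependent and not assumed here):

#include <vector>

int main()
{
    std::vector<int> v;   // empty container
    auto it = v.end();
    // --it;              // unchecked build: undefined behaviour on an empty container;
                          // checked/"bounded" build: runtime diagnostic instead
    (void)it;
}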

 
> I'm going to look into whether or not specifying min/max limits for block capacities without changing the capacities of existing blocks can have any adverse effects, i.e. if block 1 has capacity 4, and set_block_capacity_limits sets the minimum capacity to 10, whether leaving block 1 at capacity 4 could have any adverse effects.
> If not, will re-implement so that it's a simple "set variables" operation without any side effects.
> If so, will re-implement so that set_block_capacity_limits doesn't include the reallocation code when the type is non-copyable/movable (via constexpr and type_traits).

That seems a step in the wrong direction.  If you want to offer an
interface that does not reallocate, but just sets the new min/max
parameters (and thus has different requirements on the contained T),
that's fine.

However, that's a different operation (and thus should get a different name)
compared to an operation that does perform the reallocation and thus
has a few more requirements on T.

The way I've implemented it in a test implementation is that it's the same interface, but if the type is non-copyable/movable and the container isn't empty, it returns an error code. This essentially moves the responsibility from the interface to the programmer: they must clear() the container first if they are (a) using a non-copyable/movable type, (b) want to change the capacities, but (c) have already inserted elements. It would also return an error code if the type was copyable/movable but the container was not empty and it failed in its allocation of memory for the new blocks to reallocate into. Alternatively it could throw instead of returning an error code; I have no strong feelings either way on that one. Your feeling?

I think it's better to do this than to simply set the values and assume they apply to future block capacities only, as there is no way for the programmer to know which blocks currently use which capacities, and hence no way for them to decide on that basis whether existing blocks should be reallocated.
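
For concreteness, a hypothetical sketch of that behaviour (the class name, the error-code return and the stand-in internals are all illustrative, not the proposed interface):

#include <cstddef>
#include <system_error>
#include <type_traits>
#include <vector>

template <class T>
class colony_like
{
public:
    // Returns an error rather than relocating when the limits cannot be
    // applied to existing blocks: either the element type cannot be
    // copied/moved, or allocation of the replacement blocks failed.
    std::error_code set_block_capacity_limits(std::size_t min_capacity, std::size_t max_capacity)
    {
        if (!empty()
            && !std::is_copy_constructible_v<T>
            && !std::is_move_constructible_v<T>)
        {
            return std::make_error_code(std::errc::operation_not_permitted);
        }

        // ... allocate new blocks within [min_capacity, max_capacity] and relocate
        // elements; on allocation failure, return std::errc::not_enough_memory instead.
        min_limit_ = min_capacity;
        max_limit_ = max_capacity;
        return {};
    }

    bool empty() const noexcept { return elements_.empty(); }

private:
    std::vector<T> elements_;                    // stand-in for the real block structure
    std::size_t min_limit_ = 0, max_limit_ = 0;
};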

 
> I tried implementing it your way, but the code-smell got bad pretty quickly.
> You end up with code like:
> plf::colony<int, my_allocator_type>::minmax block_sizes;
> block_sizes = i_colony.get_block_capacity_limits();

auto block_sizes = i_colony.get_block_capacity_limits();

I'm aware of that, but my personal bugbear is to discourage the use of auto: it moves the required knowledge of which type you're using out of the code and into an external function definition, which isn't great for readability in my view.



>   You could just go with size_t I guess.

Would it be a problem if the outward-facing interface always used size_t here?
Block sizes > 2^32 should be rare anyway, right?

I think so. From my measurements, there's no performance advantage to be had in almost any scenario from having block sizes larger than 65535, but I can't predict a potentially better future implementation, so *shrug*.



I'm not sure I like the "runtime failure if overflow" approach here, though,
as opposed to forwarding the Allocator's size_type in the interface.

You could, I suppose, template the struct and supply the allocator directly to it, but then there are no 100% guarantees.
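
For example, something along these lines (purely illustrative; every name here is an assumption):

#include <memory>

template <class SizeType>
struct block_capacity_limits
{
    SizeType min;
    SizeType max;
};

template <class T, class Allocator = std::allocator<T>>
class colony_like
{
public:
    using size_type = typename std::allocator_traits<Allocator>::size_type;
    using minmax    = block_capacity_limits<size_type>;

    minmax get_block_capacity_limits() const noexcept { return limits_; }

private:
    minmax limits_{};
};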



> It doesn't - it does a basic merge-sort in all implementations I've seen. I pretty much came up with the pointer-sort dereferencing hack (https://plflib.org/list.htm#details) - or at least I haven't seen it elsewhere - that's why plf::list sorting is 77% faster than std::list (see: https://plflib.org/benchmarks_haswell_gcc.htm#sort).
> From my old reading of gcc/boost posts, I think the libstdc++ people are pretty against allocating memory during a sort.

Hm.

Does your "allocate a pointer array" approach in any way depend on the
internals of std::colony?

If not, maybe the right approach is to offer a stand-alone, self-contained
algorithm "indirect_sort" or so that can sort any range of forward iterators
and that expressly allocates memory, instead of providing a specialized
member function just for this single container.

So long as that allowed for specializations for a given container, that might work,
but the main reason std::list has its own internal sort function is that it only changes the next/previous pointers - no elements get reallocated - unlike colony and most other containers.
You can of course do the same pointer technique with the next/previous pointers of std::list, which is more or less what I did with plf::list, but that requires a specialization.
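
For reference, a minimal, self-contained sketch of the kind of standalone indirect sort being discussed (illustrative only - not plf::colony's or plf::list's actual implementation; it requires a movable element type and allocates both a pointer array and a temporary element buffer):

#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <memory>
#include <utility>
#include <vector>

template <class ForwardIt, class Compare = std::less<>>
void indirect_sort(ForwardIt first, ForwardIt last, Compare comp = {})
{
    using T = typename std::iterator_traits<ForwardIt>::value_type;

    // Gather pointers to the elements; only the pointers move during the sort.
    std::vector<T*> ptrs;
    for (ForwardIt it = first; it != last; ++it)
        ptrs.push_back(std::addressof(*it));

    std::sort(ptrs.begin(), ptrs.end(),
              [&](const T* a, const T* b) { return comp(*a, *b); });

    // Rearrange values into sorted order via a temporary buffer; a real
    // implementation might instead follow permutation cycles to avoid the
    // extra element storage.
    std::vector<T> sorted;
    sorted.reserve(ptrs.size());
    for (T* p : ptrs)
        sorted.push_back(std::move(*p));

    std::size_t i = 0;
    for (ForwardIt it = first; it != last; ++it, ++i)
        *it = std::move(sorted[i]);
}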
 


> I don't think there are abstract explanations for all the parameters which the other containers are able to influence either, at least not to the depth you're talking about. Prove me wrong?

Proving a "for all" theorem over prose text is hard, of course.
Let's turn this around:

If you find an example of an unspecified parameter in the existing standard text,
please tell me about it, because that needs to become an LWG issue urgently.
(For example, std::deque also has a block size internally, but it is never
exposed in the interface.)

Okay, so the precondition for inclusion in the tech spec is that it is exposed in the interface. Good to know and fair enough.

 
> Not really. I think the C++ abstract machine is based on some fairly old understandings which don't hold true anymore, i.e. that time complexity is strongly relevant in some capacity (latency, overall performance), which it isn't outside of large N or large sizeof(N). That being the case, it's not something I should overly concern myself with, as the computers it was designed for no longer exist.

The O(N or whatever) complexities given in the standard library descriptions
concern observable behavior of the algorithms, e.g. the number of invocations
of the copy constructor or the number of invocations of a comparison.

You can observe by simply counting the number of calls in your container
element type T.
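
For example, a minimal instrumented element type along these lines (illustrative only):

#include <cstddef>

struct counted
{
    static inline std::size_t copy_constructions = 0;
    static inline std::size_t comparisons = 0;

    int value = 0;

    counted() = default;
    counted(int v) : value(v) {}
    counted(const counted& other) : value(other.value) { ++copy_constructions; }

    friend bool operator<(const counted& a, const counted& b)
    {
        ++comparisons;
        return a.value < b.value;
    }
};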

My point is that the reasoning for the standard's focus on these things seems to be that they have an impact on performance in some way.
If they don't, it's hard to make a case for their relevance. Why would you care about the number of calls if the number of calls has no discernible impact on runtime behaviour? Or is there some other overarching reason to track the number of calls?
From my point of view, if the relevance is performance-related, then that relevance is greatly diminished compared to other factors. It seems a gap in vision not to see that the relevance of these things is as hardware-dependent as almost anything else. The greater the gap between memory and processor speeds becomes, the less bearing time complexity has on performance.

 
If you believe the C++ abstract machine description is no longer fit for its purpose,
you're very welcome to propose an update to EWG/CWG.  Note that the entire
state-of-the-art of compiler optimizations critically depends on the as-if
rule formulated in [intro.abstract], so please tread carefully.
(Allowing to specify something like P1315 secure_clear is in the same realm.)

My point is not so much that it's not useful, just that the necessary gap between it and real-world machines means that not everything which is pertinent to allowing a given construct to exist, in a given way, is going to be expressible within the language of the abstract machine. If cache locality were to be made part of the language of the abstract machine, that would also be a mistake, because it's a real-world constraint which may change in 30 years. See my point above about time complexity - that *was* very relevant to performance, once.
But that means I can't justify or express everything within the scope of that language.


All I'm saying is that the "technical specification" doesn't say that
std::colony's iterators have relational operators, so independent
implementations won't provide them, so portable user programs can't use
them.  If that's not your desired end-state, the "technical specification"
needs to say something about relational operators, and probably specify
their semantics (because you can't use the RandomAccessIterator
specification framework, obviously).
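
As a purely hypothetical illustration of the kind of semantics that might be specified - ordering by storage position (block sequence number, then index within the block); all member names here are assumptions:

#include <cstddef>

struct colony_iterator_like
{
    std::size_t block_number;   // position of the block within the block chain
    std::size_t element_index;  // position of the element within its block

    friend bool operator<(const colony_iterator_like& a,
                          const colony_iterator_like& b) noexcept
    {
        return a.block_number != b.block_number
                   ? a.block_number < b.block_number
                   : a.element_index < b.element_index;
    }
};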

Okay cool, will do.
Thanks again,
Matt