Date: Wed, 6 May 2020 11:59:21 +1200
> Thanks again, and once more, take anything I don't comment on as accepted
criticism/explanation.
>
> Your response didn't go to the list. I've added the list again.
>
Ah, whoops - thanks. But you added the incorrect list address- correcting
here.
You either need to say which constraints a user-supplied type for Skipfield
> is required to satisfy (note it's a template parameter, so the user can
> choose), or the template parameter must go. Currently, the wording is
> silent on any constraints for "Skipfield".
>
Torn here between obviating future improvements and being specific. It
needs to stay in order for current implementations to be useful to, in
particular, embedded folk, so it would be an unsigned integer type. If
future implementations don't use it or use something different, it can be
gracefully deprecated I guess.
>
> I haven't yet understood the "reverse iterator", --end(), empty container,
> and undefined behavior train of thought. I believe you can only do --end()
> on a non-empty container, as a matter of precondition for that operation.
>
>
I don't think the C++ spec specifies that iterators have to be bounded?
But certainly including bounding creates performance detriment and should
be optional.
In a bounded iterator --end() would be non-problematic, in a non-bounded
would result in undefined behaviour.
>
> This area probably needs a bit more thought / design work.
> It does not seem to be a frequent operation to change min/max
> mid-way during your container's lifetime, so offering a small
> interface surface here is probably a good idea.
>
I'm going to look into whether or not specifying min/max limits for block
capacities without changing the capacities of existing blocks can have any
adverse effects ie. if block 1 is capacity 4, and set_block_capacity_limits
sets min capacity to 10, whether leaving block 1 at capacity 4 could have
any adverse effects.
If not, will re-implement so that it's a simple "set variables" operation
without any side effects.
If so, will re-implement so that set_block_capacity_limits doesn't include
the reallocation code when the type is non-copyable/movable (via constexpr
and type_traits).
>
> Some ABIs can return trivially-copyable structs of sufficiently small size
> in registers. The structured binding above has a good chance that no
> stack memory will ever be used, because min/max will be kept in a register
> in the caller, too.
>
> If you pass-by-reference, ABIs generally convert that into a pointer,
> i.e. the caller needs to pass a memory region and later needs to load
> the data from the filled memory region. That is more work.
>
>
I tried implementing it your way, but the code-smell got bad pretty quickly.
You end up with code like:
plf::colony<int, my_allocator_type>::minmax block_sizes;
block_sizes = i_colony.get_block_capacity_limits();
You have to make the struct internal to colony otherwise you can't
guarantee that the allocator will supply the same size_type to block_sizes
as it does to colony. You could just go with size_t I guess. But the main
issues were with using it with constructors, where you end up with this:
plf::colony<int, my_allocator_type>::minmax block_sizes = {50, 500};
plf::colony<int, my_allocator_type> i(block_sizes, my_allocator_instance);
or worse:
plf::colony<int, my_allocator_type> i(plf::colony<int,
my_allocator_type>::minmax block_sizes(50, 500), my_allocator_instance);
I tried passing via a bracketed pair of ints, but of course that makes it
go into the initializer_list constructor instead.
I think in this instance, the way I've done it is best, or results in less
ugly code for the end user. I think the frequency of usage of
get_block_capacity_limits does not justify creation of a struct for this
purpose. The standard does use pass-by-reference for return parameters, in
places - eg. istream& getline - so it's not a big deal I feel.
> >> Any sorting near O(n^2) is not worth talking about, so you should
> simply
> >>prescribe O(n log n) with a "remarks" paragraph that this might allocate
> >> memory (which might cause std::bad_alloc to be thrown).
> >
> > The time complexity doesn't relate to the speed of sorting (if it did,
> std::list sorting would be as fast as std::vector sorting). The schema I'm
> discussing simply allows for faster sort algorithms which rely on random
> access, as opposed to relying on a slower algorithm which doesn't ala list.
>
> I'm guessing that std::list::sort also does some tricks with an
> intermediate pointer array, so I'm not seeing the point.
>
It doesn't - it does a basic merge-sort in all implementations I've seen. I
pretty much came up with the pointer-sort dereferencing hack (
https://plflib.org/list.htm#details) - or at least I haven't seen it
elsewhere - that's why plf::list sorting is 77% faster than std::list (see:
https://plflib.org/benchmarks_haswell_gcc.htm#sort).
>From my old reading of gcc/boost posts, I think the libstdC++ people are
pretty against allocating memory during a sort.
> > Because that's an edge case outside of get_index_from_iterator and
> results in performance loss in general use when you specialize for it.
>
> One missed branch, I hope?
>
More than one.
>
> Anyway, there seems to be rationale for going beyond
> std::distance(begin(), it)
> that should be recorded in the "Design Decisions" section.
>
Fair enough.
> > I don't think that's not the right place for the explanation, the right
> location is in Design decisions.
> > You wouldn't expect an explanation of how a map/vector/deque works, or
> why some approaches allow for better performance, to be included in the
> tech spec, so I don't think you can expect the same of this.
>
> std::deque doesn't allow me to specify block size parameters
> from the outside, so I don't care on the specification level.
>
> The interface for std::colony that you offered does allow me to
> influence min/max, so you need to normatively specify what such
> influence does, otherwise the parameter is meaningless and
> therefore needs to go.
>
I don't think there are abstract explanations for all the parameters which
the other containers are able to influence either, at least not to the
depth you're talking about. Prove me wrong?
> >
> > The C++ abstract machine has no way of expressing "cache locality",
> btw.
> >
> >
> > Not really my concern. I'm interested in high-performance containers,
> not the C++ abstract machine.
>
> Then you're in the wrong forum here.
>
Not really. I think the C++ abstract machine is based on some fairly old
understandings which don't hold true anymore ie. that time complexity is
strongly relevant in some capacity (latency, overall performance) which it
isn't outside of large N or large sizeof(N). That being the case, it's not
something I should overly concern myself with, as the computers it was
designed for no longer exist.
My understanding is that std::colony does not provide random-access
> iterators.
> Only those have relational operators, though, in the general scheme of
> iterator operations [random.access.iterators].
>
> There is no specification that the iterators std::colony provides have
> relational operators.
>
There has been no pushback so far from colony having bidirectional
iterators with additional relational operators. The iterators have always
been specified as bidirectional. If you have a rational objection I'd be
interested in hearing it.
> > Okay, you're trying to get me to move some prose to the tech spec and
> some to the design decisions. Is there a guide as to what prose belongs
> where?
>
> Anything explaining "why" should go to the "Design Decisions".
>
> Anything allowing to judge whether my independent (assume bad-faith)
> implementation
> of std::colony is conforming or not (given arbitrarily hostile template
> parameters
> and arbitrarily hostile sequences of operations) needs to into the
> specification.
>
> Example:
> As I said above, "cache-locality" (for instance) is nothing you can talk
> about
> in the C++ standard (because the abstract machine doesn't provide
> vocabulary
> for it), but you might want to talk about only allowing at most one memory
> allocation (observable via the Allocator template parameter) per "min"
> inserts or so.
>
Okay. Will contemplate that.
Thanks once again,
Matt
criticism/explanation.
>
> Your response didn't go to the list. I've added the list again.
>
Ah, whoops - thanks. But you added the incorrect list address- correcting
here.
You either need to say which constraints a user-supplied type for Skipfield
> is required to satisfy (note it's a template parameter, so the user can
> choose), or the template parameter must go. Currently, the wording is
> silent on any constraints for "Skipfield".
>
Torn here between obviating future improvements and being specific. It
needs to stay in order for current implementations to be useful to, in
particular, embedded folk, so it would be an unsigned integer type. If
future implementations don't use it or use something different, it can be
gracefully deprecated I guess.
>
> I haven't yet understood the "reverse iterator", --end(), empty container,
> and undefined behavior train of thought. I believe you can only do --end()
> on a non-empty container, as a matter of precondition for that operation.
>
>
I don't think the C++ spec specifies that iterators have to be bounded?
But certainly including bounding creates performance detriment and should
be optional.
In a bounded iterator --end() would be non-problematic, in a non-bounded
would result in undefined behaviour.
>
> This area probably needs a bit more thought / design work.
> It does not seem to be a frequent operation to change min/max
> mid-way during your container's lifetime, so offering a small
> interface surface here is probably a good idea.
>
I'm going to look into whether or not specifying min/max limits for block
capacities without changing the capacities of existing blocks can have any
adverse effects ie. if block 1 is capacity 4, and set_block_capacity_limits
sets min capacity to 10, whether leaving block 1 at capacity 4 could have
any adverse effects.
If not, will re-implement so that it's a simple "set variables" operation
without any side effects.
If so, will re-implement so that set_block_capacity_limits doesn't include
the reallocation code when the type is non-copyable/movable (via constexpr
and type_traits).
>
> Some ABIs can return trivially-copyable structs of sufficiently small size
> in registers. The structured binding above has a good chance that no
> stack memory will ever be used, because min/max will be kept in a register
> in the caller, too.
>
> If you pass-by-reference, ABIs generally convert that into a pointer,
> i.e. the caller needs to pass a memory region and later needs to load
> the data from the filled memory region. That is more work.
>
>
I tried implementing it your way, but the code-smell got bad pretty quickly.
You end up with code like:
plf::colony<int, my_allocator_type>::minmax block_sizes;
block_sizes = i_colony.get_block_capacity_limits();
You have to make the struct internal to colony otherwise you can't
guarantee that the allocator will supply the same size_type to block_sizes
as it does to colony. You could just go with size_t I guess. But the main
issues were with using it with constructors, where you end up with this:
plf::colony<int, my_allocator_type>::minmax block_sizes = {50, 500};
plf::colony<int, my_allocator_type> i(block_sizes, my_allocator_instance);
or worse:
plf::colony<int, my_allocator_type> i(plf::colony<int,
my_allocator_type>::minmax block_sizes(50, 500), my_allocator_instance);
I tried passing via a bracketed pair of ints, but of course that makes it
go into the initializer_list constructor instead.
I think in this instance, the way I've done it is best, or results in less
ugly code for the end user. I think the frequency of usage of
get_block_capacity_limits does not justify creation of a struct for this
purpose. The standard does use pass-by-reference for return parameters, in
places - eg. istream& getline - so it's not a big deal I feel.
> >> Any sorting near O(n^2) is not worth talking about, so you should
> simply
> >>prescribe O(n log n) with a "remarks" paragraph that this might allocate
> >> memory (which might cause std::bad_alloc to be thrown).
> >
> > The time complexity doesn't relate to the speed of sorting (if it did,
> std::list sorting would be as fast as std::vector sorting). The schema I'm
> discussing simply allows for faster sort algorithms which rely on random
> access, as opposed to relying on a slower algorithm which doesn't ala list.
>
> I'm guessing that std::list::sort also does some tricks with an
> intermediate pointer array, so I'm not seeing the point.
>
It doesn't - it does a basic merge-sort in all implementations I've seen. I
pretty much came up with the pointer-sort dereferencing hack (
https://plflib.org/list.htm#details) - or at least I haven't seen it
elsewhere - that's why plf::list sorting is 77% faster than std::list (see:
https://plflib.org/benchmarks_haswell_gcc.htm#sort).
>From my old reading of gcc/boost posts, I think the libstdC++ people are
pretty against allocating memory during a sort.
> > Because that's an edge case outside of get_index_from_iterator and
> results in performance loss in general use when you specialize for it.
>
> One missed branch, I hope?
>
More than one.
>
> Anyway, there seems to be rationale for going beyond
> std::distance(begin(), it)
> that should be recorded in the "Design Decisions" section.
>
Fair enough.
> > I don't think that's not the right place for the explanation, the right
> location is in Design decisions.
> > You wouldn't expect an explanation of how a map/vector/deque works, or
> why some approaches allow for better performance, to be included in the
> tech spec, so I don't think you can expect the same of this.
>
> std::deque doesn't allow me to specify block size parameters
> from the outside, so I don't care on the specification level.
>
> The interface for std::colony that you offered does allow me to
> influence min/max, so you need to normatively specify what such
> influence does, otherwise the parameter is meaningless and
> therefore needs to go.
>
I don't think there are abstract explanations for all the parameters which
the other containers are able to influence either, at least not to the
depth you're talking about. Prove me wrong?
> >
> > The C++ abstract machine has no way of expressing "cache locality",
> btw.
> >
> >
> > Not really my concern. I'm interested in high-performance containers,
> not the C++ abstract machine.
>
> Then you're in the wrong forum here.
>
Not really. I think the C++ abstract machine is based on some fairly old
understandings which don't hold true anymore ie. that time complexity is
strongly relevant in some capacity (latency, overall performance) which it
isn't outside of large N or large sizeof(N). That being the case, it's not
something I should overly concern myself with, as the computers it was
designed for no longer exist.
My understanding is that std::colony does not provide random-access
> iterators.
> Only those have relational operators, though, in the general scheme of
> iterator operations [random.access.iterators].
>
> There is no specification that the iterators std::colony provides have
> relational operators.
>
There has been no pushback so far from colony having bidirectional
iterators with additional relational operators. The iterators have always
been specified as bidirectional. If you have a rational objection I'd be
interested in hearing it.
> > Okay, you're trying to get me to move some prose to the tech spec and
> some to the design decisions. Is there a guide as to what prose belongs
> where?
>
> Anything explaining "why" should go to the "Design Decisions".
>
> Anything allowing to judge whether my independent (assume bad-faith)
> implementation
> of std::colony is conforming or not (given arbitrarily hostile template
> parameters
> and arbitrarily hostile sequences of operations) needs to into the
> specification.
>
> Example:
> As I said above, "cache-locality" (for instance) is nothing you can talk
> about
> in the C++ standard (because the abstract machine doesn't provide
> vocabulary
> for it), but you might want to talk about only allowing at most one memory
> allocation (observable via the Allocator template parameter) per "min"
> inserts or so.
>
Okay. Will contemplate that.
Thanks once again,
Matt
Received on 2020-05-05 19:03:04