
Re: Cache Alloc - 1.3x to 13.4x faster

From: Phil Bouchard <boost_at_[hidden]>
Date: Mon, 26 Jul 2021 03:06:29 -0400
OK, I integrated the now very generic allocator into a std::list, and a
complete benchmark is now available here:

https://github.com/philippeb8/cache_alloc/blob/main/README.md

Note that the initialization and destruction of the container are
excluded from the benchmark: if you really require speed, the
containers should be global / static anyway.
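
For illustration, a minimal sketch of such a harness might look like the
following (the cache_alloc name and header are assumptions based on the
linked repository; the workload and sizes are illustrative only):

    #include <chrono>
    #include <cstdio>
    #include <list>

    // Assumed header/name from the linked repository; any custom
    // allocator with the standard allocator interface would fit here.
    #include "cache_alloc.hpp"

    // Constructed and destroyed outside the timed region, as a global,
    // per the note above.
    std::list<int, cache_alloc<int>> global_list;

    int main()
    {
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < 1000000; ++i)
            global_list.push_back(i);   // only the operations are timed
        for (int i = 0; i < 1000000; ++i)
            global_list.pop_front();
        auto stop = std::chrono::steady_clock::now();
        std::printf("%lld us\n", (long long)
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
    }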


-- 
Phil Bouchard
Founder & CTO
C.: (819) 328-4743
Fornux <http://www.fornux.com>
On 7/25/21 1:16 PM, Phil Bouchard via Std-Proposals wrote:
> Yeah thanks. I'll come back with a better integration and benchmarks 
> later tonight or tomorrow.
>
> That and a hash map upgrade should definitely help the standard
> library, because C++ is widely used for low-latency algorithms.
>
> --
> Phil Bouchard
> Founder
> C.: (819) 328-4743
>
> Fornux <http://www.fornux.com/>
>
>> On Jul 25, 2021, at 1:05 PM, Scott Michaud <scott_at_[hidden]> wrote:
>>
>>
>> Yeah, a thread's call stack is limited in size... usually in the 1 to
>> 10 megabyte range.
>>
>> That said, if you want computation speed, the stack is the part of
>> memory that will probably be hottest in cache.
>>
>> If that's too little memory for your use case, then you can make your
>> own stack in the heap, of whatever size you want.
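>>
>> One standard way to approximate that is a sketch like this (the 64 MiB
>> size is arbitrary):
>>
>>     #include <memory_resource>
>>     #include <vector>
>>
>>     int main()
>>     {
>>         // A heap-backed "stack": allocations bump a pointer through
>>         // the buffer and are all released together when the resource
>>         // is destroyed.
>>         std::pmr::monotonic_buffer_resource arena(64 * 1024 * 1024);
>>         std::pmr::vector<int> v(&arena);
>>         for (int i = 0; i < 1000; ++i)
>>             v.push_back(i);   // no call to the global allocator per growth
>>     }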
>>
>> You can also combine custom allocators (PMR or otherwise) with an 
>> object pool (if your resource is constructor and/or destructor 
>> heavy). I don't think the Standard Library has anything for that, 
>> though. I'd assume it would work something like a unique_ptr that, 
>> instead of freeing, would just return the object to a last-in 
>> first-out list. Not sure how many objects would see a speedup from 
>> that versus just constructing and destructing, though.
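>>
>> A rough sketch of that unique_ptr idea (all of the names here are made
>> up; this is not an existing Standard Library facility):
>>
>>     #include <memory>
>>     #include <vector>
>>
>>     template <class T>
>>     class object_pool
>>     {
>>         std::vector<std::unique_ptr<T>> free_;  // last-in, first-out
>>     public:
>>         struct returner     // deleter that recycles instead of freeing
>>         {
>>             object_pool* pool;
>>             void operator()(T* p) const
>>             { pool->free_.emplace_back(p); }    // back to the LIFO list
>>         };
>>         using handle = std::unique_ptr<T, returner>;
>>
>>         handle acquire()
>>         {
>>             if (free_.empty())
>>                 return handle(new T, returner{this});  // construct on demand
>>             T* p = free_.back().release();             // reuse newest object
>>             free_.pop_back();
>>             return handle(p, returner{this});
>>         }
>>     };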
>>
>>
>> On 7/25/2021 12:52 PM, Phil Bouchard via Std-Proposals wrote:
>>> The container I'm looking for has to handle large amounts of data
>>> extremely fast. Stack allocations are fast but limited in size, AFAIK.
>>>
>>> --
>>> Phil Bouchard
>>> Founder
>>> C.: (819) 328-4743
>>>
>>> Fornux <http://www.fornux.com/>
>>>
>>>> On Jul 25, 2021, at 11:39 AM, Marcin Jaczewski via Std-Proposals 
>>>> <std-proposals_at_[hidden]> wrote:
>>>>
>>>> Have you considered
>>>> https://en.cppreference.com/w/cpp/memory/polymorphic_allocator ?
>>>> It adds some per-object overhead but lets you remove heap allocation
>>>> altogether if your container has a limited lifetime.
>>>> Simply put, stack allocation is a couple of orders of magnitude 
>>>> faster than heap allocation.
>>>> It has its limits, like lifetime and allowed size, but with PMR
>>>> you can chain allocators and allow fallback allocations on the heap
>>>> when you exhaust the local buffer.
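>>>>
>>>> For example (a sketch; the buffer size is arbitrary):
>>>>
>>>>     #include <memory_resource>
>>>>     #include <vector>
>>>>
>>>>     void f()
>>>>     {
>>>>         char buf[4096];
>>>>         // Serve allocations from the local buffer first; once it is
>>>>         // exhausted, fall back to the default heap resource.
>>>>         std::pmr::monotonic_buffer_resource arena(
>>>>             buf, sizeof buf, std::pmr::new_delete_resource());
>>>>         std::pmr::vector<int> v(&arena);
>>>>         for (int i = 0; i < 10000; ++i)
>>>>             v.push_back(i);   // spills to the heap past the buffer
>>>>     }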
>>>>
>>>>
>>>> On Sun, 25 Jul 2021 at 11:16, Phil Bouchard via Std-Proposals
>>>> <std-proposals_at_[hidden]> wrote:
>>>>
>>>>     I am currently abstracting / defining things that are not yet
>>>>     defined, so I'll have to rewrite the example I provided later
>>>>     today.
>>>>
>>>>     But what I'm really trying to abstract is an allocator cache
>>>>     class optimized for LIFO & FIFO container types. It could be
>>>>     used for any container type, but it would be optimized for
>>>>     queues and stacks.
>>>>
>>>>     --
>>>>     Phil Bouchard
>>>>     Founder
>>>>     C.: (819) 328-4743
>>>>
>>>>     Fornux <http://www.fornux.com/>
>>>>
>>>>>     On Jul 25, 2021, at 12:24 AM, Jason McKesson via Std-Proposals
>>>>>     <std-proposals_at_[hidden]> wrote:
>>>>>
>>>>>     On Sat, Jul 24, 2021 at 7:01 PM Phil Bouchard via Std-Proposals
>>>>>     <std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>>     - Front / back (non-middle) deletions and insertions will
>>>>>>     guarantee the allocations are contiguous (they will keep block
>>>>>>     sizes together, organized, and without holes in between);
>>>>>>     - The size of the pages will be directly proportional to the
>>>>>>     usage frequency, so the greater the demand is, the bigger the
>>>>>>     memory pages will be, and thus the fewer overall page allocations;
>>>>>>     - What I presented is perfect for LIFO container types; a
>>>>>>     similar specialization can be done for FIFO, and other
>>>>>>     container types will use a generic specialization.
>>>>>>
>>>>>>     That will force developers to think twice before using a
>>>>>>     specific container type if they really want speed.
>>>>>>
>>>>>
>>>>>     Do you think developers interested in performance *don't*
>>>>>     think about
>>>>>     allocation patterns of their containers? Developers interested
>>>>>     in the
>>>>>     absolute fastest performance generally don't use standard
>>>>>     library containers anyway, since they will want control over
>>>>>     things that standard library containers don't expose.
>>>>>
>>>>>     Also, what is a "LIFO container" and how are they different
>>>>>     from "FIFO
>>>>>     containers"? I know what FIFO and LIFO mean, but I don't know
>>>>>     how they
>>>>>     apply to *containers*. FIFO (i.e., a queue) is not a type of
>>>>>     container;
>>>>>     it's an *access pattern* for a container. LIFO is a stack, but
>>>>>     again
>>>>>     that's a way of accessing a container. Now yes, you can write a
>>>>>     stack/queue class that exposes only the stack/queue interface.
>>>>>     And you
>>>>>     could therefore call that a stack/queue container. But the actual
>>>>>     container structure underlying that interface is ultimately
>>>>>     what we're
>>>>>     talking about.
>>>>>
>>>>>     At the end of the day, what does `some_allocator_type<T,
>>>>>     std::vector>`
>>>>>     really know about how you intend to use that `vector`? Does it
>>>>>     know
>>>>>     that you're going to use a LIFO access pattern? Because there
>>>>>     are ways
>>>>>     of using `vector` that don't involve being a stack. And if you are
>>>>>     genuinely interested in LIFO access patterns, there are more
>>>>>     efficient
>>>>>     data structures than using `vector`. And that's about more
>>>>>     than just
>>>>>     how you allocate memory; it has to involve how you *use* that
>>>>>     memory.
>>>>>     --
>>>>>     Std-Proposals mailing list
>>>>>     Std-Proposals_at_[hidden]
>>>>>     https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>     --
>>>>     Std-Proposals mailing list
>>>>     Std-Proposals_at_[hidden]
>>>>     https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>
>>>> -- 
>>>> Std-Proposals mailing list
>>>> Std-Proposals_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>
>

Received on 2021-07-26 02:06:41