sg14: Re: [SG14] sizeof_block_metadata function, colony/hive, advice wanted

From: Matt Bentley <mattreecebentley_at_[hidden]>
Date: Fri, 26 Nov 2021 16:51:37 +1300

P.S. As draft ideas for why this function might be useful, consider the
following scenarios:

1. The user knows their initial number of elements is very small, say 4
elements, and will likely remain small, but could potentially grow into
hundreds of elements given the right scenario.

They want to specify a low minimum block capacity to avoid wasting too
much memory in the default scenario. But how much memory does the
metadata use at that capacity? If it is more than the memory used by the
elements themselves, they might want to increase the minimum block
capacity; otherwise, when the container expands, those small initial
blocks may end up wasting more memory rather than less.

Inspecting potential block metadata memory usage for different minimum
capacities in this context could lead to smarter decisions here.
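
As a rough sketch of scenario 1 (not taken from any implementation): the
function model_block_metadata() below is a hypothetical stand-in for the
proposed sizeof_block_metadata(), and its 64-byte header and 1-byte
skipfield entries are assumptions chosen purely for illustration. The
second function, also hypothetical, looks for the smallest capacity at
which the elements occupy at least as much memory as the metadata.

```cpp
#include <cstddef>

// Hypothetical stand-in for the proposed sizeof_block_metadata().
// Assumption: a fixed 64-byte per-block header plus one 1-byte skipfield
// entry per element slot, plus one extra entry. Real implementations
// will differ, which is the point of asking the container instead.
std::size_t model_block_metadata(std::size_t block_capacity)
{
    const std::size_t header_bytes = 64;         // assumed
    const std::size_t skipfield_entry_bytes = 1; // assumed
    return header_bytes + (block_capacity + 1) * skipfield_entry_bytes;
}

// Returns the smallest capacity in [min_cap, max_cap) at which element
// storage is at least as large as the block's metadata, or max_cap if
// no capacity in that range qualifies.
std::size_t smallest_balanced_capacity(std::size_t element_size,
                                       std::size_t min_cap,
                                       std::size_t max_cap)
{
    for (std::size_t cap = min_cap; cap < max_cap; ++cap)
    {
        if (cap * element_size >= model_block_metadata(cap))
        {
            return cap;
        }
    }
    return max_cap;
}
```

Under these assumed numbers, a block of 4 sixteen-byte elements holds 64
bytes of element data against 69 bytes of metadata, so the user might
prefer a minimum capacity of 5 or more. With a real implementation the
figures would come from the container rather than from this model.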

2. The user is doing a lot of random erasures over time on a large
number of elements, which may lead to a large amount of wasted space (in
hive/colony) if the blocks are very large. Their inclination is to make
the blocks smaller: they are not iterating over the container much, so
the greater iterative locality of larger blocks doesn't affect
performance much, whereas smaller blocks are more likely to become empty
and can then be freed to the OS. But the smaller the blocks are, the
smaller the ratio of element data to block metadata becomes.

If they know the approximate number of elements they're working with,
and the approximate erasure rates, using the function in question they
can algorithmically decide which maximum block size will result in the
lowest memory usage overall during execution.
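
A minimal sketch of what "algorithmically decide" could look like for
scenario 2, under heavy simplifying assumptions: erasures are treated as
independent and uniformly random, erased slots are never reused, and a
block is freed only once every slot in it has been erased. All function
names are hypothetical, and model_metadata_bytes() stands in for the
proposed sizeof_block_metadata() with made-up sizes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the proposed sizeof_block_metadata().
// Assumption: 64-byte header plus a 1-byte skipfield entry per slot,
// plus one extra entry.
std::size_t model_metadata_bytes(std::size_t block_capacity)
{
    return 64 + (block_capacity + 1);
}

// Estimated bytes still allocated after a fraction of all slots has been
// erased. A block of capacity cap is assumed to be empty (and freed)
// with probability erased_fraction^cap.
double estimated_bytes(std::size_t total_slots, double erased_fraction,
                       std::size_t element_size, std::size_t cap)
{
    const double blocks = std::ceil(static_cast<double>(total_slots) /
                                    static_cast<double>(cap));
    const double p_block_empty =
        std::pow(erased_fraction, static_cast<double>(cap));
    const double retained_blocks = blocks * (1.0 - p_block_empty);
    const double bytes_per_block =
        static_cast<double>(cap * element_size + model_metadata_bytes(cap));
    return retained_blocks * bytes_per_block;
}

// Picks the candidate capacity with the lowest estimate.
// Precondition: candidates is not empty.
std::size_t best_block_capacity(std::size_t total_slots,
                                double erased_fraction,
                                std::size_t element_size,
                                const std::vector<std::size_t>& candidates)
{
    std::size_t best = candidates.front();
    double best_bytes =
        estimated_bytes(total_slots, erased_fraction, element_size, best);
    for (const std::size_t cap : candidates)
    {
        const double bytes =
            estimated_bytes(total_slots, erased_fraction, element_size, cap);
        if (bytes < best_bytes)
        {
            best = cap;
            best_bytes = bytes;
        }
    }
    return best;
}
```

With no erasures this model favours the largest candidate, since the
fixed per-block cost is spread over more elements; with a high erasure
rate it favours small blocks, because they are far more likely to empty
out and be freed. A real decision would need the implementation's actual
metadata sizes and a better erasure model than this one.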

I guess the same stuff could potentially be done with a profiler, but
that requires more work.

On 26/11/2021 11:43 am, Matt Bentley wrote:
> I guess it's similar to memory(), which allows the user to enquire as to
> how much memory is being used by the container as a whole, given that
> implementations may vary significantly and the skipfield side of things
> can make guesses at the actual memory usage impossible.
>
> But you're right in the sense that this is really information one wants
> *before* specifying custom min/max block sizes. By default my
> implementation weighs the size of the block metadata, plus the
> sizeof(container), against the sizeof(element_type) to determine the
> most reasonable initial block capacity. It does so with the following
> macro:
>
> #define PLF_MIN_BLOCK_CAPACITY \
>     (sizeof(aligned_element_type) * 8 > \
>      (sizeof(plf::colony<element_type>) + sizeof(group)) * 2) ? 8 : \
>     (((sizeof(plf::colony<element_type>) + sizeof(group)) * 2) / \
>      sizeof(aligned_element_type))
>
> (group is the name of the block metadata struct).
>
> The main difference is that the new function would also include the
> amount of memory used by the skipfield (since this would also vary
> per-implementation).
>
> Someone could do something similar to the above code with this function
> if they wanted an initial block capacity that was responsive to the
> sizeof(element_type) and the memory consumed by metadata, but not the
> same as the implementation's default initial block capacity. Of course,
> this would probably involve multiple calls to the function to find the
> optimal starting block size.
>
> In addition, it can be a function which allows a user to probe in
> advance what different block sizes yield in terms of metadata memory
> usage, for any given implementation, outside of live code.
> M@
>
> On 26/11/2021 5:46 am, Ben Craig wrote:
>> If you can provide a short, motivational chunk of code that shows how
>> it would be used, then that would make it a lot easier to answer the
>> question about whether it's useful or not. I suspect the function may
>> be providing something that is easy for the implementation to provide,
>> but very hard for a user to use.
>>
>> I suspect that this is trying to help the user solve an optimization
>> problem, where they are trying to minimize the amount of total memory
>> used for a given range of element counts. If that's the case, then
>> write a short code snippet that either solves, or hints at the
>> solution to that problem.
>>
>>> -----Original Message-----
>>> From: SG14 <sg14-bounces_at_[hidden]> On Behalf Of Matt Bentley via
>>> SG14
>>> Sent: Wednesday, November 24, 2021 7:35 PM
>>> To: Low Latency:Game Dev/Financial/Trading/Simulation/Embedded Devices
>>> <sg14_at_[hidden]>
>>> Cc: Matt Bentley <mattreecebentley_at_[hidden]>
>>> Subject: [EXTERNAL] [SG14] sizeof_block_metadata function, colony/hive,
>>> advice wanted
>>>
>>> Hi all,
>>> thinking about including a
>>> size_type sizeof_block_metadata(const size_t block_size)
>>>
>>> function for colony/hive, as this would allow the user to ascertain the
>>> amount of memory used by a memory block's metadata in colony (including
>>> skipfield), given the number of elements stored in that memory block.
>>> This is less important for colony which is a fixed implementation, more
>>> important for std::hive which could potentially have many alternative
>>> implementations and these might not be visible to the user in text form.
>>>
>>> The idea was that if someone knows how much memory is used by each
>>> block outside of the actual memory used to store the elements, that
>>> might aid in terms of knowing how much cache space is used by the
>>> metadata and people might be able to make better decisions about the
>>> min/max capacities of memory blocks (if they are specifying these
>>> manually) on that basis.
>>>
>>> The function would throw an exception if the block_size specified was
>>> outside the minimum/maximum block capacity ranges.
>>>
>>> My main question to you is whether this function is worth including -
>>> as it adds yet another function to the codebase.
>>> Thanks,
>>> M@

Received on 2021-11-25 21:51:42