Date: Mon, 25 Apr 2022 20:23:37 -0300
Hi folks,
Creating unrelated objects contiguously in memory is a very useful
technique to implement some data structures and avoid needless allocations.
Take std::make_shared<T>(...) for example. It typically performs a single
allocation that stores both the control block for the shared pointer and
the "T" instance.
Back when I first learned of it, I thought it was a nifty trick for a very
specialized case, but I started to see that pattern more and more.
Another instance of this "trick" is how "new T[sz]" is implemented for
non-trivially-destructible T. The compiler needs to know how many objects
to destroy when "delete[] ptr" is called, yet where this information is
stored is not obvious. Implementations typically allocate extra memory to
store the element count right before the sequence of Ts, and read it back
during delete[].
This technique sits in the "nifty trick" realm (in my opinion), because
there are many details for the average developer to be aware of when
applying it.
1 - The alignment each type requires. std::align helps, but the usage
example on cppreference <https://en.cppreference.com/w/cpp/memory/align> is
scary, with casts to char*, reinterpret_casts and pointer arithmetic.
2 - Exception guarantees may be hard to get exactly right. You need to make
sure created objects are destroyed in the reverse order should any
exception be thrown during the creation of such objects.
These issues get prohibitively complex to solve during everyday development
as you mix in more than two types or accept arrays (like
std::make_shared<T[]>).
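To make the two points above concrete, here is a minimal hand-written sketch for just one header object followed by n Ts. The names Header, make_header_and_array and destroy_header_and_array are hypothetical, the objects are value-initialized, and the memory comes straight from ::operator new (no allocator support). It is meant as an illustration of the bookkeeping involved, not as a reference implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical header type, standing in for something like a control block.
struct Header {
    std::size_t count;
};

// Creates one Header followed by n value-initialized Ts in a single allocation.
// If any T constructor throws, everything built so far is destroyed in reverse
// order and the memory is released before the exception propagates.
template <class T>
std::pair<Header*, T*> make_header_and_array(std::size_t n) {
    // Worst-case padding between the Header and the first T is alignof(T) - 1.
    std::size_t space = (alignof(T) - 1) + n * sizeof(T);
    void* raw = ::operator new(sizeof(Header) + space);

    Header* header = ::new (raw) Header{n};  // cannot throw for this trivial type

    // Point 1: find the first suitably aligned address after the Header.
    void* cursor = static_cast<unsigned char*>(raw) + sizeof(Header);
    void* aligned = std::align(alignof(T), n * sizeof(T), cursor, space);
    assert(aligned != nullptr);  // guaranteed by the slack reserved above
    T* first = static_cast<T*>(aligned);

    // Point 2: roll back in reverse order if a constructor throws.
    std::size_t built = 0;
    try {
        for (; built < n; ++built) ::new (static_cast<void*>(first + built)) T();
    } catch (...) {
        while (built > 0) first[--built].~T();
        header->~Header();
        ::operator delete(raw);
        throw;
    }
    return {header, first};
}

// Destroys the Ts in reverse order, then the Header, then frees the memory.
template <class T>
void destroy_header_and_array(Header* header, T* first) {
    for (std::size_t i = header->count; i > 0; --i) first[i - 1].~T();
    void* raw = header;
    header->~Header();
    ::operator delete(raw);
}
```

Even in this two-type case the caller has to keep the size computation, the alignment step and the rollback logic consistent with each other, and each additional type multiplies that bookkeeping.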
Finally, the purpose of this email is to float the idea of a standard
library facility to make this an every(other)day technique rather than a
trick.
For the sake of introducing some concreteness, let's start with this naive
API:
template<class... Ts>
std::tuple<std::pair<Ts*, Ts*>...>
make_contiguous_objects(initializer_for<Ts>...);
This function would return a tuple of begin/end pointer pairs, one for each
T in Ts, delimiting the objects that were created on the heap, given some
information on how to initialize them.
So, for std::make_shared<T[]>(size_t n), a big part of the job could be
achieved through:
auto objs = std::make_contiguous_objects<ControlBlock, T>(
    /*number of control blocks=*/ 1, /*number of Ts=*/ n);
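For illustration only, a stripped-down version of such a facility could look like the sketch below. Everything in it is an assumption made for the example: it lives in a hypothetical namespace "sketch" rather than std, "initializer_for" is reduced to a plain element count per type, objects are value-initialized, memory comes from the aligned form of ::operator new (C++17), and destroy_contiguous_objects is an invented name for the matching cleanup function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <tuple>
#include <utility>

namespace sketch {  // hypothetical namespace; nothing here is a real std API

template <class... Ts>
using ranges_t = std::tuple<std::pair<Ts*, Ts*>...>;

namespace detail {

// Carves an aligned range of n Ts out of the buffer and advances the cursor.
template <class T>
std::pair<T*, T*> carve(void*& cursor, std::size_t& space, std::size_t n) {
    T* first = static_cast<T*>(std::align(alignof(T), n * sizeof(T), cursor, space));
    assert(first != nullptr);  // the caller reserved worst-case padding
    cursor = static_cast<unsigned char*>(cursor) + n * sizeof(T);
    space -= n * sizeof(T);
    return {first, first + n};
}

// Value-initializes a range; undoes its own partial work if a constructor throws.
template <class T>
void construct_range(std::pair<T*, T*> r) {
    T* cur = r.first;
    try {
        for (; cur != r.second; ++cur) ::new (static_cast<void*>(cur)) T();
    } catch (...) {
        while (cur != r.first) (--cur)->~T();
        throw;
    }
}

template <class T>
void destroy_range(std::pair<T*, T*> r) noexcept {
    while (r.second != r.first) (--r.second)->~T();
}

// Destroys the first `done` ranges of the tuple, last range first.
template <class Tuple, std::size_t... Is>
void destroy_ranges(const Tuple& r, std::size_t done, std::index_sequence<Is...>) noexcept {
    constexpr std::size_t N = sizeof...(Is);
    ((N - 1 - Is < done ? destroy_range(std::get<N - 1 - Is>(r)) : void()), ...);
}

// Constructs every range in order; rolls back completed ranges on failure.
template <class Tuple, std::size_t... Is>
void construct_ranges(const Tuple& r, std::index_sequence<Is...> seq) {
    std::size_t done = 0;  // ranges fully constructed so far
    try {
        ((construct_range(std::get<Is>(r)), ++done), ...);
    } catch (...) {
        destroy_ranges(r, done, seq);
        throw;
    }
}

}  // namespace detail

// One element count per type; all objects share a single allocation.
template <class... Ts, class... Counts>
ranges_t<Ts...> make_contiguous_objects(Counts... counts) {
    static_assert(sizeof...(Ts) > 0 && sizeof...(Ts) == sizeof...(Counts),
                  "pass exactly one element count per type");
    constexpr std::size_t max_align = std::max({alignof(Ts)...});
    // Payload bytes plus worst-case padding in front of every range.
    std::size_t space = ((alignof(Ts) - 1 + std::size_t(counts) * sizeof(Ts)) + ...);
    void* raw = ::operator new(space, static_cast<std::align_val_t>(max_align));
    void* cursor = raw;
    // Braced initialization evaluates the carve calls left to right.
    ranges_t<Ts...> r{detail::carve<Ts>(cursor, space, std::size_t(counts))...};
    try {
        detail::construct_ranges(r, std::index_sequence_for<Ts...>{});
    } catch (...) {
        ::operator delete(raw, static_cast<std::align_val_t>(max_align));
        throw;
    }
    return r;
}

template <class... Ts>
void destroy_contiguous_objects(const ranges_t<Ts...>& r) noexcept {
    void* raw = std::get<0>(r).first;  // the first range starts at the allocation
    detail::destroy_ranges(r, sizeof...(Ts), std::index_sequence_for<Ts...>{});
    ::operator delete(raw, static_cast<std::align_val_t>(std::max({alignof(Ts)...})));
}

}  // namespace sketch
```

With this sketch, the call above would be spelled sketch::make_contiguous_objects<ControlBlock, T>(1, n), and the returned tuple would later be handed to sketch::destroy_contiguous_objects. Allocating with the largest alignment among Ts is a simplification that lets the cleanup function recover the allocation from the first range; a real facility would probably want a dedicated handle type instead.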
Having implemented this myself in a more complex context
<https://github.com/brenoguim/flexclass>, I know that "initializer_for" may
require a variety of information besides the number of objects to
initialize.
Allocators, initialization values, initialization through input iterators,
and so on: there are many possibilities. Lastly, a utility to destroy and
free the resulting objects would also come in handy.
So, am I on to something?
Thanks for your attention!
Breno Guimarães
Received on 2022-04-25 23:23:48