Date: Fri, 11 Nov 2022 23:45:25 +0300
In C++23, vector got a range constructor which can be used among with ranges::to to construct a vector from a range.
If the source range is a random-access range, vector constructor can pre-allocate enough storage before traversing a range. If the source range is an input range, constructor is left with making emplace_back.
However, things get interesting for forward range. For example, in current implementation of MSVC STL, they treat forward ranges as random-access ranges in the constructor (the same is true for their vector::vector(It begin, It end) constructor). I didn’t check, but other STL libraries probably do the same, or they could go with input range way.
There are two variants here: we can either call ranges::distance, pre-allocate and then assign elements (as MSVC STL currently does), effectively traversing range twice or we can process the forward range as an input range.
It is impossible for the constructor to tell if one variant or another would be better, so the choice of double-traversal looks good in general case. However, there might be a lot of cases when range traversal is very expensive and only caller has this knowledge. Even worse, performance gains or loses may vary on a case-by-case basis.
I have two suggestions and I believe both of them are worth adding to the library, with the first one called to solve this particular problem, while the second one potentially having wider application:
1. Propose a new vector range constructor:
template<typename Range>
vector::vector(from_range_t, Range &&range, size_t estimated_number_elements);
This constructor calls reserve(estimated_number_elements) and then traverses the range only once, calling emplace_back. This would potentially trade more allocated memory for speed.
ranges::to is already capable of passing additional arguments to constructors:
auto d = source_range | views::filter(/* expensive filter here */) | ranges::to<std::vector>(estimated_number_of_elements);
2. Add new view adaptor views::as_input (similar to C++23 views::as_rvalue and as_const) that “converts” the range to the input range.
Sincerely,
Alexander Bessonov
HHD Software Ltd.
If the source range is a random-access range, vector constructor can pre-allocate enough storage before traversing a range. If the source range is an input range, constructor is left with making emplace_back.
However, things get interesting for forward range. For example, in current implementation of MSVC STL, they treat forward ranges as random-access ranges in the constructor (the same is true for their vector::vector(It begin, It end) constructor). I didn’t check, but other STL libraries probably do the same, or they could go with input range way.
There are two variants here: we can either call ranges::distance, pre-allocate and then assign elements (as MSVC STL currently does), effectively traversing range twice or we can process the forward range as an input range.
It is impossible for the constructor to tell if one variant or another would be better, so the choice of double-traversal looks good in general case. However, there might be a lot of cases when range traversal is very expensive and only caller has this knowledge. Even worse, performance gains or loses may vary on a case-by-case basis.
I have two suggestions and I believe both of them are worth adding to the library, with the first one called to solve this particular problem, while the second one potentially having wider application:
1. Propose a new vector range constructor:
template<typename Range>
vector::vector(from_range_t, Range &&range, size_t estimated_number_elements);
This constructor calls reserve(estimated_number_elements) and then traverses the range only once, calling emplace_back. This would potentially trade more allocated memory for speed.
ranges::to is already capable of passing additional arguments to constructors:
auto d = source_range | views::filter(/* expensive filter here */) | ranges::to<std::vector>(estimated_number_of_elements);
2. Add new view adaptor views::as_input (similar to C++23 views::as_rvalue and as_const) that “converts” the range to the input range.
Sincerely,
Alexander Bessonov
HHD Software Ltd.
Received on 2022-11-11 20:45:44