In C++23, vector got a range constructor which can be used among with
ranges::to to construct a vector from a range.
If the source range is a random-access range, vector constructor can
pre-allocate enough storage before traversing a range. If the source range is
an input range, constructor is left with making emplace_back.
However, things get interesting for forward range. For example, in
current implementation of MSVC STL, they treat forward ranges as random-access
ranges in the constructor (the same is true for their vector::vector(It begin,
It end) constructor). I didn’t check, but other STL libraries probably do the
same, or they could go with input range way.
There are two variants here: we can either call ranges::distance,
pre-allocate and then assign elements (as MSVC STL currently does),
effectively traversing range twice or we can process the forward range as an
input range.
It is impossible for the constructor to tell if one variant or another
would be better, so the choice of double-traversal looks good in general case.
However, there might be a lot of cases when range traversal is very expensive
and only caller has this knowledge. Even worse, performance gains or loses may
vary on a case-by-case basis.
I have two suggestions and I believe both of them are worth adding to the
library, with the first one called to solve this particular problem, while the
second one potentially having wider application:
1. Propose a new vector range constructor:
template<typename Range>
vector::vector(from_range_t, Range &&range, size_t
estimated_number_elements);
This constructor calls reserve(estimated_number_elements) and then
traverses the range only once, calling emplace_back. This would potentially
trade more allocated memory for speed.
ranges::to is already capable of passing additional arguments to
constructors:
auto d = source_range | views::filter(/* expensive filter here */) |
ranges::to<std::vector>(estimated_number_of_elements);
2. Add new view adaptor views::as_input (similar to C++23
views::as_rvalue and as_const) that “converts” the range to the input
range.
Sincerely,
Alexander
Bessonov
HHD Software Ltd.