ISOCPP std-proposals List: Re: [std-proposals] Do not accept std::generator, please.

From: Lewis Baker <lewissbaker_at_[hidden]>
Date: Sun, 4 Sep 2022 10:37:00 +0930

On Sat, Sep 3, 2022 at 6:19 PM Nikl Kelbon via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> std::generator has huge problems.
> I will briefly list them:
> 1. The generator must not depend on the allocator (only the promise_type
> depends), but std::generator has a template argument for alloc, in
> particular, this prohibits adding 2 generators that allocated memory
> differently into a container. It is possible to implement this without type
> erasure.
> Link to example implementation:
>
> https://github.com/kelbon/dd_generator/blob/4053fcff456c885d9330708d2e67f103c4212319/include/dd_generator.hpp#L264
>

The design of std::generator allows an *optional* allocator type to be
specified.

If you do not specify it then the allocator is determined solely by the
arguments.
If there is a std::allocator_arg + allocator object pair passed as the
first two arguments of the coroutine then it will use that.
Otherwise, it will fall back to using std::allocator by default.
In this way it is possible to create a container of std::generator objects
that have different allocator types by just not specifying the allocator in
the template argument.

If the template argument is specified then either:
- the allocator is stateless in which case it uses a default-constructed
allocator and avoids the user needing to explictly specify the allocator
- the allocator is stateful and the caller still needs to pass an instance
of the allocator using std::allocator_arg

The main motivation for allowing the allocator as a template argument to
std::generator is to allow the stateless-allocator use-case to be more
ergonomic.
e.g. so we can define an alias

template<typename Ref, typename Val = remove_cvref_t<Ref>>
using my_generator = std:::generator<Ref, Val, my_stateless_allocator>;

And then can just use my_generator throughout my program and it will just
use the custom allocator without having to worry about passing
std::allocator_arg to every coroutine function.

There is no additional type-erasure necessary for supporting these
use-cases.
We can just leverage the inherent type-erasure of coroutines in general to
handle this.
e.g. see
https://github.com/lewissbaker/generator/blob/main/include/__generator.hpp#L277-L330

2. Overhead for ueslss default type erasure(default behavior) and stack in
> generator (used only for element_of, which is basically for(auto&& v : gen)
> co_yield v;
> Its just uselss
> You pay for what you dont use.
>

What "useless default type erasure" are you referring to here?
There is no extra type-erasure needed for allocators.
Coroutines are themselves inherently type-erased (which we can take
advantage of).

There is indeed some overhead for supporting the recursive generator
use-case.
Current compilers generally struggle to inline/optimise operator++() as
this necessarily needs an extra level of indirection to be able to resume
different coroutines depending on what is the current leaf coroutine.

The recursive generator is more efficient for recursive use-cases but less
efficient for non-recursive use-cases.
Supporting 'co_yield elements_of(x)' is not possible without the recursive
support.

We discussed whether it would be better to add a non-recursive generator or
a recursive generator.
I think the consensus was that it was much harder to write your own
recursive generator and so that had more value to exist in the standard
library.
We can add another non-recursive generator type in future if compilers are
unable to optimise the recursive generator sufficiently.

> 3. It is undefined behavior to call begin for std::generator twice. But
> iteraing not all values in one loop is a very logical action and expected
> behavior - just keep generating values
> Besides, such an error is extremely non-obvious when using ranges or range
> based for loop.
>

This is a typical restriction provided by many other input_range
implementations.

Would you expect begin() to advance the iteration to the next element?
Or would you expect to be able to call begin() multiple times and each time
just get a copy of the iterator pointing to the same element?
Supporting either of these would potentially affect performance - either
requiring more state to be held in the generator to keep track of whether
this was the first call to begin() or not, or require an extra indirection
when starting the coroutine for the first time (since a subsequent call to
begin() might need to resume a different coroutine than the top-level
coroutine).

> 4. interface that is very obscure to the user and requires knowledge of
> implementation to write efficient code
> No one will write generator<const T&> foo(); without knowing why it is
> needed and why it is better for perfmornace
>

Can you please expand on what part of this you think is "obscure"?
The user needs to make a choice about whether they want to yield a sequence
of lvalues, const-lvalues, rvalues.
This will affect what kinds of values they can pass to co_yield and whether
they incur extra copies/etc.

> But there are way to write generator without such problems
> https://github.com/kelbon/dd_generator
>

Line 43
<https://github.com/kelbon/dd_generator/blob/9e33ee704444cc7f8d36f67c7358c9d19e770a0d/include/dd_generator.hpp#L43>:
You are not handling alignment requirements of the allocator object here.
Need to pad to an address that is a multiple of alignof(Alloc).
Line 32,48: You also need to check for
std::allocator_traits<Alloc>::is_always_equal

The need for any_coroutine_handle (which stores the coroutine_handle and
the promise pointer) and reinterpret_cast to get back to the 'value'
data-member seems like a fairly unsafe way of implementing this.
If you are storing the promise-pointer anyway, you could store a pointer to
the base-class instead.
e.g. see my generator<Ref, Val, use_allocator_arg> implementation
<https://github.com/lewissbaker/generator/blob/e7c6c1c3009f0e9cf32add465b16f6295bfb9ac4/include/__generator.hpp#L789>

Note that we cannot have coroutine_handle<void> return a pointer to the
promise with the currently implemented coroutine ABIs because the offset
between the frame pointer and the promise address is not necessarily a
constant - there may be padding added to align a promise type with
alignment >= 2*sizeof(void*). We would either need to change the ABI so
that the padding was placed before the function-pointers, or we'd need to
store an extra field to record the offset of the promise so we can compute
its address correctly.

I would also worry about the void* obtained from the
coroutine_handle<void>::promise() being used in an unsafe way
- it could be confused with address()/from_address()
- you need to use reinterpret_cast<> which is difficult to get right
without invoking UB
These are minor concerns, however.

- Lewis.

Received on 2022-09-04 01:07:14