On Sat, Sep 3, 2022 at 6:19 PM Nikl Kelbon via Std-Proposals <std-proposals@lists.isocpp.org> wrote:

std::generator has huge problems.
I will briefly list them:
1. The generator must not depend on the allocator (only the promise_type depends), but std::generator has a template argument for alloc, in particular, this prohibits adding 2 generators that allocated memory differently into a container. It is possible to implement this without type erasure.
Link to example implementation:
https://github.com/kelbon/dd_generator/blob/4053fcff456c885d9330708d2e67f103c4212319/include/dd_generator.hpp#L264

The design of std::generator allows an optional allocator type to be specified.

If you do not specify it then the allocator is determined solely by the arguments.

If there is a std::allocator_arg + allocator object pair passed as the first two arguments of the coroutine then it will use that.

Otherwise, it will fall back to using std::allocator by default.

In this way it is possible to create a container of std::generator objects that have different allocator types by just not specifying the allocator in the template argument.

If the template argument is specified then either:

- the allocator is stateless in which case it uses a default-constructed allocator and avoids the user needing to explictly specify the allocator

- the allocator is stateful and the caller still needs to pass an instance of the allocator using std::allocator_arg

The main motivation for allowing the allocator as a template argument to std::generator is to allow the stateless-allocator use-case to be more ergonomic.

e.g. so we can define an alias

template<typename Ref, typename Val = remove_cvref_t<Ref>>

using my_generator = std:::generator<Ref, Val, my_stateless_allocator>;

And then can just use my_generator throughout my program and it will just use the custom allocator without having to worry about passing std::allocator_arg to every coroutine function.

There is no additional type-erasure necessary for supporting these use-cases.

We can just leverage the inherent type-erasure of coroutines in general to handle this.

e.g. see https://github.com/lewissbaker/generator/blob/main/include/__generator.hpp#L277-L330

2. Overhead for ueslss default type erasure(default behavior) and stack in generator (used only for element_of, which is basically for(auto&& v : gen) co_yield v;
Its just uselss
You pay for what you dont use.

What "useless default type erasure" are you referring to here?

There is no extra type-erasure needed for allocators.

Coroutines are themselves inherently type-erased (which we can take advantage of).

There is indeed some overhead for supporting the recursive generator use-case.

Current compilers generally struggle to inline/optimise operator++() as this necessarily needs an extra level of indirection to be able to resume different coroutines depending on what is the current leaf coroutine.

The recursive generator is more efficient for recursive use-cases but less efficient for non-recursive use-cases.

Supporting 'co_yield elements_of(x)' is not possible without the recursive support.

We discussed whether it would be better to add a non-recursive generator or a recursive generator.

I think the consensus was that it was much harder to write your own recursive generator and so that had more value to exist in the standard library.

We can add another non-recursive generator type in future if compilers are unable to optimise the recursive generator sufficiently.

3. It is undefined behavior to call begin for std::generator twice. But iteraing not all values in one loop is a very logical action and expected behavior - just keep generating values
Besides, such an error is extremely non-obvious when using ranges or range based for loop.

This is a typical restriction provided by many other input_range implementations.

Would you expect begin() to advance the iteration to the next element?

Or would you expect to be able to call begin() multiple times and each time just get a copy of the iterator pointing to the same element?

Supporting either of these would potentially affect performance - either requiring more state to be held in the generator to keep track of whether this was the first call to begin() or not, or require an extra indirection when starting the coroutine for the first time (since a subsequent call to begin() might need to resume a different coroutine than the top-level coroutine).

4. interface that is very obscure to the user and requires knowledge of implementation to write efficient code
No one will write generator<const T&> foo(); without knowing why it is needed and why it is better for perfmornace

Can you please expand on what part of this you think is "obscure"?

The user needs to make a choice about whether they want to yield a sequence of lvalues, const-lvalues, rvalues.

This will affect what kinds of values they can pass to co_yield and whether they incur extra copies/etc.

But there are way to write generator without such problems
https://github.com/kelbon/dd_generator

Line 43: You are not handling alignment requirements of the allocator object here. Need to pad to an address that is a multiple of alignof(Alloc).

Line 32,48: You also need to check for std::allocator_traits<Alloc>::is_always_equal

The need for any_coroutine_handle (which stores the coroutine_handle and the promise pointer) and reinterpret_cast to get back to the 'value' data-member seems like a fairly unsafe way of implementing this.

If you are storing the promise-pointer anyway, you could store a pointer to the base-class instead.

e.g. see my generator<Ref, Val, use_allocator_arg> implementation

Note that we cannot have coroutine_handle<void> return a pointer to the promise with the currently implemented coroutine ABIs because the offset between the frame pointer and the promise address is not necessarily a constant - there may be padding added to align a promise type with alignment >= 2*sizeof(void*). We would either need to change the ABI so that the padding was placed before the function-pointers, or we'd need to store an extra field to record the offset of the promise so we can compute its address correctly.

I would also worry about the void* obtained from the coroutine_handle<void>::promise() being used in an unsafe way

- it could be confused with address()/from_address()

- you need to use reinterpret_cast<> which is difficult to get right without invoking UB

These are minor concerns, however.

- Lewis.