std-proposals: Re: 回复： Delay the judgement for coroutine function after the instantiation of template entity.

From: Jason McKesson <jmckesson_at_[hidden]>
Date: Fri, 22 Jan 2021 00:49:54 -0500

It's difficult to respond to you when your post intermingles responses
to different posts that are making different points. If you don't want
to send multiple e-mails, then at least make it clear which parts of
the text are responding to which people. Don't jumble them up
together.

On Thu, Jan 21, 2021 at 10:57 PM chuanqi.xcq
<yedeng.yd_at_[hidden]> wrote:
>
> > Because a function should either be a coroutine or not be a coroutine.
>
> I agree with this statement. But a template function isn't a function.

When I said "a function", I meant "that code you wrote that has a
particular name." And don't rules-lawyer about "well actually template
arguments are part of its name;" you know what I'm talking about.

> > Being a coroutine *does* change these things. External code has to be
> written differently for the coroutine version than the non-coroutine
> version.
>
> This may be one side effects. But my thought is the template argument is part of the function declaration. And now I think it is not easy to know whether function is coroutine by the declaration.
>
> For example, the following code:
> ```
> SomeType func();
> ```
>
> Can we know whether func is coroutine or not from declaration? No, we need to see the implementation or the comment to decide.

You misunderstood my point. In one instantiation, you had a function
that returned an `int`; in its coroutine form, it returned a
`task<int>`. It doesn't much matter if the coroutine form is a "true"
coroutine or just something that returns a `task<int>`. What matters
is that the way the caller *uses* the function must change.

Broadly speaking, if you have a template function, instantiating it
with different parameters may change its return type, but it shouldn't
unexpectedly change the basic way you *interact* with that kind of
type. And I know there are functions in the standard library that
violates those rules (`any_cast` being the most prominent). But it's
not a thing we should encourage.

> > Let's say that we have a function X which is, by its nature, an
> asynchronous coroutine function. This means that X has to schedule its
> resumption based on some external asynchronous process, like doing
> file/networking IO, etc. Doing this is *why* you made the function a
> coroutine; it is the nature of X that it waits on something
> asynchronously. And let's say that we have some function Y which gets
> called by X.
>
> The picture I see is that X is a coroutine and Y is a caller of X which need to do something only after X made his job. So Y should be a coroutine too, isn't it?

You got that backwards; Y "gets called by X". And no, the caller of X
doesn't need to be a coroutine *either*. At some point, every
coroutine has to be called by some function that is not a coroutine.

> > I was looking for something more specific in terms of the actual code,
> not the details of how the compiler generated non-optimal assembly.
> And specifically, I'm trying to understand the *meaning* of the code,
> not just a bunch of no-name functions that call each other. I want to
> know the details of what you're trying to do that led to the cases of
> both good performance and bad performance.
>
> Because of business secrets, I can't show the original codes and I also think it isn't necessary. Here is a more detailed example I think is enough to discuss:
> ```
> auto ReaderByPrefix(const std::string& prefixKey, const std::vector<std::string>& suffixKeys,
> CacheType cache, MemoryPoolPool* pool, MetricsCollector* metricsCollector) {
> assert(pool);
> KeyType keyHash(0);
> if (!GetKeyHash(prefixKey, keyHash))
> {
> if (metricsCollector)
> metricsCollector->EndQuery(KeyHash, "Not find Hash for PrefixKey: ", prefixKey);
> co_return Iterator(pool);
> }
> auto Res = co_await ReaderByKey(keyHash, suffixKeys, cache, pool, metricsCollector);
> tryInsertToCache(Res, cache);
> co_return Res;
> }
>
> auto ReaderByKey(KeyType keyHash, const std::vector<std::string>& suffixKeys, CacheType cache,
> MemoryPoolPool* pool, MetricsCollector* metricsCollector)
> {
> assert(pool);
> std::vector<uint64_t> skeyHashs;
> if (!GetSKeyHashVec(suffixKeys, skeyHashs))
> {
> if (metricsCollector)
> metricsCollector->EndQuery(KeyHash, "Not find SKeyHash for vec for hash: ", keyHash);
> co_return Iterator(pool);
> }
> auto Res = co_await LookupVecs(keyHash, std::move(skeyHashs), cache, pool, metricsCollector);
> tryInsertToCache(Res, cache);
> co_return Res;
> }
>
> auto LookupVecs(KeyType keyHash, std::vector<uint64_t> skeyHashs, CacheType cache,
> MemoryPoolPool* pool, MetricsCollector* metricsCollector) {
> auto Res = co_await SearchInCache(cache, keyHash, skeyHashs, pool, metricsCollector);
> if (Res) {
> metricsCollector->record(Res);
> co_return Res;
> }
> auto Reader = pool->getReader();
> auto SearchType = keyHash.getType();
> switch (SearchType) {
> case ReaderType1:
> // ReaderType1::LookupImpl maybe blocking
> auto Res = co_await static_cast<ReaderType1*>(Reader)->LookupImpl(keyHash, skeyHashs, pool, metricsCollector);
> tryInsertToCache(Res, cache);
> break;
> case ReaderType2:
> // and so on;
> break;
> default: {
> // ...
> }
> }
>
> if (metricsCollector)
> metricsCollector->EndQuery(Res, "End of query");
> co_return Iterator(pool);
> }
> ```
>
> Our project is a storage and query library. So there is a lot of interfaces which would be called by the user from the upper layer. Once a user starts a query, we need to made the query first and give the result to the user after that.

So, let me break down my understanding of your code from a quick
inspection. You have two basic asynchronous operations:
`SearchInCache` (whose asychronous nature I will assume is not due to
externally-defined code), and `LookupImpl`, which is user-provided
code that may or may not actually do asynchronous stuff. These are the
two terminal async operations; whenever this call graph suspends, it
will ultimately be because of one of those operations.

The source of the suspensions is deep in the call graph. There is a
very great deal of code between the source of the suspensions and the
actual root (the last caller that *isn't* a coroutine). And most of
the code between these two points *does not care* if suspending
happens or not.

All of this adds up to a textbook example of when to use stackful
coroutines. They can suspend through *anything*; none of the code
between the source and the receiver needs to know they are in a
coroutine.

So what we come down to is this: you want this feature so that you can
(ab)use stackless coroutines in a scenario that is almost tailor-made
for stackful coroutines. And stackful coroutines would almost
certainly alleviate your performance problems in less asynchronous
cases, since each function in the graph won't be its own heap
allocation.

So I would say that this is not a good motivating case for the change
to the standard, since you're only encountering this problem because
you're writing your code wrong.

Received on 2021-01-21 23:50:12