Jason,

Thanks for the feedback! Sorry for the day's delay. It took a little time to write a response.

First, if the standard wants to define specific algorithms -- that's fine. Some use cases may be specific enough that they need to be defined down to the algorithm. I do not want to suggest something that takes away something good from stakeholders.

That said, the argument against goal-based allocators is that some users exist who wouldn't use anything but std::allocator, and some users exist who want to know exactly what the implementation is doing. However, that argument does not say that no users exist between those two extremes (let alone how many), and it does not say that well-described goals cannot exist that serve the middle group, as well as maybe even some of the two extremes in some situations. (Even if they made the best-performing allocators for their main problem... maybe they'll use a debug allocator once or twice? Maybe they'll use one of our allocators when they encounter performance issues with secondary problems? etc.) The std::async example is one that shows poorly-described goals can exist, and that's fair -- but that's possible any time you define a specification. I'm more curious how much value can be derived from goal-based allocators, and what market they will serve.

How big is that group?
How easily can they be served?
What are their goals?
How much effort will it take (in development and education) before new behaviour becomes commonplace -- maybe even a default?
What do they want that behaviour to even be?
What do platforms want that behaviour to be?
Are there any long-term benefits to platforms?
Are there any long-term benefits to the language?

And yes, if we define use cases terribly, then this could be awful... and it's not entirely clear what awful is without thought.

What is being too vague?
What is being too explicit?
What aligns with other initiatives (such as performance, tooling, etc.)?
What needs to happen before it's a pain?

I am curious to see the range in what platforms want versus the range in what developers want. I know from my own experience that I was frustrated with being stuck with std::allocator for cases that I knew very well could be done better based on how the data was being accessed. I didn't want to import a new third-party dependency... and I didn't want to go through making it myself.

However, I did end up going through making it myself, although it will still need a bit of testing and performance tuning before I trust it enough to put it in production somewhere. That's not an effort that most users should need to go through. It's also holding benefits hostage from those who could very much use it.

But, before we go about solving the problems, we need to define what they are... which means we need to know what goals actual users and platforms have.

Thanks again for the thoughtful remarks!

On 11/5/2020 11:53 PM, Jason McKesson via Std-Proposals wrote:

On Thu, Nov 5, 2020 at 6:17 PM Scott Michaud via Std-Proposals
<std-proposals@lists.isocpp.org> wrote:

Arthur,

Wow, thanks for the code review!

The goal when I was designing it was for per-thread temporaries in a thread pool job system. I envisioned it as a sort of expansion pack for a thread's stack. Elements that would persist longer than a blocking function call are not designed for that allocator, so I didn't consider moving from thread to thread.

This tangentially hits the core point, though.

My proposal was suggesting that memory allocation should be goal-centric.

Would enumerating goals be too focused to the point that users wouldn't care?
Do the actual goals in real-world memory allocation overlap too much?
Would going down to that level of use case not provide enough benefits to matter?

My thought is that this would be easy to teach and flexible to implement, especially if a vendor wants to try something way outside-the-box, like the "skipping main RAM and mapping directly to a cache" example.

If we say "have a stack-based allocator" then the implementation's hands are tied to an algorithm. If we say "have an allocator that's fast for locals" then they can take it and run as far as they want.
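To make the "stack-based allocator" side of that contrast concrete, here is a minimal sketch of an algorithm-specific allocator. It is an illustration only: it uses `std::pmr::monotonic_buffer_resource` from C++17 as a stand-in, not the allocator under review in this thread, and the function name `sum_with_stack_scratch` and the 1024-byte buffer size are made up for the example.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <numeric>
#include <vector>

// Sums 1..n using a scratch vector whose storage is carved out of a
// buffer that lives in this function's stack frame.
int sum_with_stack_scratch(std::size_t n) {
    // Hypothetical size; aligned so no bytes are lost to alignment padding.
    alignas(std::max_align_t) std::array<std::byte, 1024> buffer;

    // Bump-pointer ("monotonic") allocation over the stack buffer.
    // With the default upstream resource, requests that do not fit
    // fall back to the heap instead of failing.
    std::pmr::monotonic_buffer_resource arena(buffer.data(), buffer.size());

    std::pmr::vector<int> scratch(&arena);
    scratch.reserve(n);
    for (std::size_t i = 1; i <= n; ++i) {
        scratch.push_back(static_cast<int>(i));
    }
    return std::accumulate(scratch.begin(), scratch.end(), 0);
}
```

Calling `sum_with_stack_scratch(10)` keeps all of its temporaries inside the stack buffer. The mechanism is fully pinned down here, which is exactly what a goal such as "fast for locals" would leave open to the implementation.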

But again that's why I'm requesting feedback. I might be miles away from what typical stakeholders care about.

Thanks again!

Your notion of "goals" rather than "algorithms" feels a bit like
`std::async`, but worse. Here's what I mean.

If you launch an `async` task with `launch::deferred`, you get a
specific algorithm: the task executes on the thread that gets the
future value. If you launch an `async` task with the `launch::async`
policy, you also get a specific algorithm: a new thread is created
(along with all side-effects thereof) in which the task will be
executed.

But if you specify both launch policies, then you don't get an
algorithm; you get a goal of sorts. That goal being "be asynchronous".
Maybe the implementation uses a thread-pool for these, or maybe it
just launches a whole new thread for each one. Or something in between;
it's all "quality of implementation".
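
For illustration, the three launch forms described above can be
sketched as follows. This is a minimal example with made-up helper
names, not code from the proposal; it only shows which thread runs
the task and is not a benchmark.

```cpp
#include <future>
#include <thread>

// launch::deferred: the task runs lazily on whichever thread calls get().
bool deferred_runs_on_caller() {
    const auto caller = std::this_thread::get_id();
    auto f = std::async(std::launch::deferred,
                        [] { return std::this_thread::get_id(); });
    return f.get() == caller;
}

// launch::async: the task runs as if on a new thread of execution.
bool async_runs_elsewhere() {
    const auto caller = std::this_thread::get_id();
    auto f = std::async(std::launch::async,
                        [] { return std::this_thread::get_id(); });
    return f.get() != caller;
}

// Both policies: the implementation chooses. Only the result is
// specified, not where or when the task runs.
int either_policy_result() {
    auto f = std::async(std::launch::async | std::launch::deferred,
                        [] { return 6 * 7; });
    return f.get();
}
```

The first two helpers can make a definite claim about the executing
thread. The third cannot, which is the "quality of implementation"
gap being described.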

But that's the problem: the implementation details actually *matter*.
If the implementation is creating individual threads for each task,
then you can't use `std::async` in a scenario where you're launching
thousands of short tasks every second.

Of course, there are users for whom these details just don't matter.
Maybe they're not launching thousands of tasks per second. Maybe they
just need to do some work off-thread and need a way to get that data
back, so they're not picky. Etc.

But even so, the variance among implementations of
`launch::async|deferred` `std::async` calls is never actually
advantageous to the user. Having real control over what's going on may
not be necessary for every user, but lacking said control is never
helpful either. The algorithms being employed matter.

Your suggestion is similar to this. You want to transition from a
model where a user specifies the mechanics of an allocator to one
where the user expresses the intent of the allocator's usage.

There's one problem though. In the `std::async` case, the "goal" form
of `async` was still useful to some users because they're not picky
about the performance characteristics of that form of task
parallelism. Optimization isn't important to those use cases, so
letting the implementation choose the algorithm is adequate.

But in your case, that never actually happens.

If I need to do some special memory management shenanigans, that is
almost certainly because of a performance issue. That is, I cannot
afford to heap-allocate new storage within the context where I'm doing
this. So from my perspective, it is *really important* that I know
what algorithm my allocator is using.

Or at least, what algorithm my allocator is *not* using.

The concept "local allocator" tells me nothing useful. If I don't know
that the allocator will never heap-allocate (after creation), then I
cannot use it where I can't allow heap allocation. You might be able
to put some kind of restriction on implementers, similar to how
standard algorithms have big-O requirements. But that just restricts
which algorithms can be used, to the point where there's basically
only one (significant) way to implement most algorithms.
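
As an illustration of the kind of guarantee being asked for, here is
a minimal sketch of an allocator that is known never to touch the
heap after creation. It assumes C++17's `std::pmr`; the function name
`fits_without_heap` and the 256-byte buffer are hypothetical and
exist only for this example.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

// Returns true if `count` ints fit in a fixed 256-byte buffer.
// Returns false if the request would have needed more memory.
bool fits_without_heap(std::size_t count) {
    alignas(std::max_align_t) std::array<std::byte, 256> buffer;

    // null_memory_resource() as upstream: when the buffer is
    // exhausted, allocation throws std::bad_alloc instead of
    // silently falling back to the heap.
    std::pmr::monotonic_buffer_resource arena(
        buffer.data(), buffer.size(), std::pmr::null_memory_resource());

    try {
        std::pmr::vector<int> values(&arena);
        values.reserve(count);  // single allocation from the buffer
        for (std::size_t i = 0; i < count; ++i) {
            values.push_back(static_cast<int>(i));
        }
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Here `fits_without_heap(16)` succeeds and `fits_without_heap(1000)`
fails loudly. That behaviour follows from the specified mechanism,
not from a stated goal.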

If I'm in a performance-critical scenario, and there's one specific
algorithm that I know will solve that problem, why would I want a
"goal" instead of the algorithm that I know will solve my problem? And
if I'm not in a performance-critical scenario... why wouldn't I just
use the heap? Using non-`std::allocator`s is troublesome; I'd only go
through that trouble if I need to. And if I need to, then I *need* a
specific algorithm with known characteristics.