Re: [std-proposals] Extension to runtime polymorphism proposed

From: Muneem <itfllow123_at_[hidden]>
Date: Sat, 4 Apr 2026 06:57:13 +0500
In standardese:
I want the abstract machine to render any value chosen at runtime in
place, regardless of its type.
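
For comparison, here is a minimal sketch of how a value can be picked by a runtime index and used in place (without copying it) with today's C++17 facilities. The names `visit_at`, `visit_at_impl` and `Describe` are illustrative assumptions; they are not part of the proposal or of the standard library.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper: calls f on the element of tup selected by the runtime
// index i, passing it by const reference. Returns false if i is out of range.
template <typename Tuple, typename F, std::size_t... Is>
bool visit_at_impl(const Tuple& tup, std::size_t i, F&& f,
                   std::index_sequence<Is...>) {
    // The fold expands to one comparison per element; the short-circuiting
    // || stops at the first index that matches.
    return ((i == Is ? (f(std::get<Is>(tup)), true) : false) || ...);
}

template <typename... Ts, typename F>
bool visit_at(const std::tuple<Ts...>& tup, std::size_t i, F&& f) {
    return visit_at_impl(tup, i, std::forward<F>(f),
                         std::index_sequence_for<Ts...>{});
}

// Example visitor (an assumption for illustration): writes a short textual
// description of whatever element it is handed.
struct Describe {
    std::string* out;
    void operator()(int v) const { *out = "int " + std::to_string(v); }
    void operator()(double v) const { *out = "double " + std::to_string(v); }
    void operator()(const std::string& v) const { *out = "string " + v; }
};
```

Calling `visit_at(t, i, Describe{&out})` on a `std::tuple<int, double, std::string>` runs the overload matching element `i`. How the comparisons are lowered (a chain of branches, a jump table, or something else) is left to the compiler here.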

On Sat, 4 Apr 2026, 6:54 am Muneem, <itfllow123_at_[hidden]> wrote:

> Think of the intermediate code representation as the point where the C++
> front end is finished and the back end of the compiler takes over (for
> example LLVM, which then translates the code to assembly).
>
> On Sat, 4 Apr 2026, 6:47 am Muneem, <itfllow123_at_[hidden]> wrote:
>
>> I really wish that you could advise me on how to present this problem,
>> because it does seem like I may be incapable of putting it into words.
>> Please do write a counter-proposal; I would really appreciate it.
>> Regards, Muneem.
>>
>> On Sat, 4 Apr 2026, 6:43 am Muneem, <itfllow123_at_[hidden]> wrote:
>>
>>> I am really, really sorry if my response was not satisfactory.
>>> In short, all I want is to solve the problem where we can't tell the
>>> compiler to "render an object of any type in place at the intermediate code
>>> level". This is a problem because we have to do it ourselves, unlike in
>>> languages like Go. Once this problem is solved, heterogeneous lists would
>>> be possible.
>>> Regards, Muneem
>>>
>>> On Sat, 4 Apr 2026, 6:24 am Steve Weinrich via Std-Proposals, <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> Hi Muneem,
>>>>
>>>> I had hoped by asking you to explain the problem you are trying to
>>>> solve, I might be able to help you describe things in a way that would make
>>>> it easier for you to describe the exact language feature you are proposing.
>>>>
>>>> Either I am not smart enough to understand you (a good probability) or
>>>> you are incapable of describing what you seek in C++ language terms. Keep
>>>> in mind that the committee only deals with the language!
>>>>
>>>> I wish you all the best.
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>> On Fri, Apr 3, 2026, 19:09 Muneem via Std-Proposals <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>>> Thank you for your response!
>>>>> What is the problem (the question you asked)?
>>>>> There are many possible readings of your question, so I hope one of
>>>>> these answers it:
>>>>>
>>>>> 1. Before C++ and its static type system existed, this problem could be
>>>>> described as branching overhead and verbosity, which is why we needed
>>>>> better techniques that avoided it.
>>>>> This Wikipedia article explains where branching first became a
>>>>> problem (all credit to Wikipedia:
>>>>> https://en.wikipedia.org/wiki/Branch_table):
>>>>> Use of branch tables and other raw data encoding was common in the
>>>>> early days of computing when memory was expensive, CPUs were slower and
>>>>> compact data representation and efficient choice of alternatives were
>>>>> important. This is truer now than ever. In particular, the issue with
>>>>> branching is pointer indirection through an implementation such as
>>>>> vtables or function pointers. Sometimes branching may in fact be faster,
>>>>> but constructs that give the compiler no context or intent make it hard
>>>>> for the compiler to decide what to do. This is why C++ provides if
>>>>> statements, switch statements, conditional (ternary) expressions, and
>>>>> more, so that it can produce the best intermediate code representation
>>>>> possible. Each kind of branch statement isn't just new syntax: it makes
>>>>> the user write code a certain way and lets the compiler optimize before
>>>>> the compiler back end reads the code.
>>>>>
>>>>> 2. The problem is the lack of a language-level construct for type
>>>>> erasure that can result in an optimized intermediate representation of
>>>>> the code.
>>>>>
>>>>> 3. The problem is that virtual functions don't do it, switch case
>>>>> statements don't do it, and manual type erasure code doesn't do it;
>>>>> nothing does. The "it" is type erasure patterns. Switch case statements
>>>>> (and the others) fail completely if you don't write code for each object
>>>>> again and again.
>>>>>
>>>>> 4. The problem is the verbosity of current branching/polymorphism
>>>>> techniques for type erasure. Not only that, but you can't even overload a
>>>>> polymorphic function to return a different type based on an argument
>>>>> (unless the return type is a polymorphic class, known as "Return Type
>>>>> Relaxation" in "The C++ Programming Language", 4th edition, by Bjarne
>>>>> Stroustrup, section 20.3.6) in order to fix this problem with the
>>>>> visitor pattern or some other double dispatch pattern.
>>>>>
>>>>> 5. The problem is the lack of a clear expression of type erasure.
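
To make the branch-table idea from point 1 concrete, here is a minimal sketch of the same three-way choice written once as a table of function pointers and once as a switch. The operations and the names `kOps`, `dispatch_table` and `dispatch_switch` are assumptions chosen for illustration; nothing here comes from the proposal itself.

```cpp
#include <array>
#include <cstddef>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
int mul(int a, int b) { return a * b; }

using BinaryOp = int (*)(int, int);

// The branch table: entry i holds the handler for opcode i.
constexpr std::array<BinaryOp, 3> kOps{add, sub, mul};

// Table dispatch: one bounds check, then an indirect call through the table.
// Unknown opcodes yield 0 (an arbitrary choice for this sketch).
int dispatch_table(std::size_t opcode, int a, int b) {
    return opcode < kOps.size() ? kOps[opcode](a, b) : 0;
}

// The same choice spelled as a switch. The compiler is free to lower this to
// compare-and-branch code, to a jump table, or to inlined bodies.
int dispatch_switch(std::size_t opcode, int a, int b) {
    switch (opcode) {
        case 0: return add(a, b);
        case 1: return sub(a, b);
        case 2: return mul(a, b);
        default: return 0;
    }
}
```

Both functions return the same result for every opcode; which one is faster depends on the compiler, the target, and how predictable the opcode is, so it has to be measured rather than assumed.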
>>>>>
>>>>>
>>>>> 2. I don't want to make a heterogeneous list in the traditional sense,
>>>>> but rather a list of const references of any type, so it isn't against C++
>>>>> philosophies. It just tries to automate the process of manual type
>>>>> erasure and leave it to the compiler to produce the optimal intermediate
>>>>> code representation based on the specific program and on the context and
>>>>> the source of every subscripting operation. That is as C++ as one can
>>>>> get. It would not be C++ if I were to just ask for a heterogeneous list
>>>>> that is completely up to the implementation; rather, I want type erasure
>>>>> for const lvalue references at the language level (with an optimized
>>>>> intermediate code representation). Think of it like templates, if that
>>>>> makes it easier to reason about. In fact, this is a form of "Return Type
>>>>> Relaxation" ("The C++ Programming Language", 4th edition, by Bjarne
>>>>> Stroustrup, section 20.3.6), but instead of pointers and references, I
>>>>> only want to use const references.
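
As a point of reference for "manual type erasure", here is a minimal sketch of a hand-written, non-owning, type-erased const reference that can be stored in an ordinary list. `ConstRef`, `to_text` and `text_at` are hypothetical names, and the single erased operation (conversion to text) is an assumption made only to keep the example small.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical non-owning, type-erased const reference. It stores a pointer
// to the object plus one function pointer that knows the object's real type.
// The referenced object must outlive the ConstRef (do not bind temporaries).
class ConstRef {
public:
    template <typename T>
    ConstRef(const T& obj)
        : obj_(&obj),
          to_text_([](const void* p) {
              // Recover the static type that was erased at construction.
              return to_text_impl(*static_cast<const T*>(p));
          }) {}

    std::string to_text() const { return to_text_(obj_); }

private:
    // The operations each supported type must provide, written by hand.
    static std::string to_text_impl(int v) { return std::to_string(v); }
    static std::string to_text_impl(double v) { return std::to_string(v); }
    static std::string to_text_impl(const std::string& s) { return s; }

    const void* obj_;
    std::string (*to_text_)(const void*);
};

// Runtime subscripting into a list of references to unrelated types.
std::string text_at(const std::vector<ConstRef>& refs, std::size_t i) {
    return refs.at(i).to_text();
}
```

With `int n` and `std::string s` in scope, `std::vector<ConstRef> refs{n, s}` gives a list whose elements refer to the originals, so a later change to `n` is visible through `text_at(refs, 0)`. Every new operation or supported type has to be added by hand, which is the repetition the message above describes.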
>>>>>
>>>>>
>>>>>
>>>>> On Sat, 4 Apr 2026, 5:22 am Steve Weinrich via Std-Proposals, <
>>>>> std-proposals_at_[hidden]> wrote:
>>>>>
>>>>>> Hi Muneem,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am not trying to be difficult, but the “problems” that you have
>>>>>> described are {potentially} the result of implementation choices and/or C++
>>>>>> limitations. What I am currently interested in is the problem that was
>>>>>> presented before making those choices.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Let me see if I can give an example. Someone says, I have a
>>>>>> problem. Every time I insert an object into std::vector, I have to re-sort
>>>>>> the vector. Obviously, they are using the wrong container. *But
>>>>>> even if they switch to a std::map, we don’t know if the data truly needs to
>>>>>> be sorted. Or how frequently it needs to be sorted.*
>>>>>>
>>>>>>
>>>>>>
>>>>>> You say, “You want to choose between three container objects but
>>>>>> they don't have different types.”
>>>>>>
>>>>>>
>>>>>>
>>>>>> That could look like:
>>>>>>
>>>>>>
>>>>>>
>>>>>> std::vector<int> alpha;
>>>>>>
>>>>>> std::vector<int> beta;
>>>>>>
>>>>>> std::vector<int> gamma;
>>>>>>
>>>>>>
>>>>>>
>>>>>> This immediately raises the question, “Why three vectors?”
>>>>>>
>>>>>>
>>>>>>
>>>>>> You say, “You want to choose between any three objects but they have
>>>>>> different types.”
>>>>>>
>>>>>>
>>>>>>
>>>>>> That could look like:
>>>>>>
>>>>>>
>>>>>>
>>>>>> class Alpha
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA ();
>>>>>>
>>>>>>
>>>>>>
>>>>>> private:
>>>>>>
>>>>>> // Some Alpha specific data
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> class Beta
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA ();
>>>>>>
>>>>>>
>>>>>>
>>>>>> private:
>>>>>>
>>>>>> // Some Beta specific data
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> class Gamma
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA ();
>>>>>>
>>>>>>
>>>>>>
>>>>>> private:
>>>>>>
>>>>>> // Some Gamma specific data
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> That immediately raises the question, “Will virtual functions work?”
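
For readers following along, this is a minimal sketch of what the virtual-function answer to that question could look like. The base class `Base`, the returned values, and the helpers `make_objects` and `call_at` are assumptions for illustration; the classes in the message above have no common base and no function bodies.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical common interface for the three classes.
struct Base {
    virtual ~Base() = default;
    virtual int funcA() = 0;
};

// Placeholder bodies: each class returns a distinct value so that the
// dispatch is observable.
struct Alpha : Base { int funcA() override { return 1; } };
struct Beta  : Base { int funcA() override { return 2; } };
struct Gamma : Base { int funcA() override { return 3; } };

// Builds a list holding one object of each type behind the common interface.
std::vector<std::unique_ptr<Base>> make_objects() {
    std::vector<std::unique_ptr<Base>> objects;
    objects.push_back(std::make_unique<Alpha>());
    objects.push_back(std::make_unique<Beta>());
    objects.push_back(std::make_unique<Gamma>());
    return objects;
}

// Chooses an object by runtime index and calls through the vtable.
int call_at(const std::vector<std::unique_ptr<Base>>& objects, std::size_t i) {
    return objects.at(i)->funcA();
}
```

This works when every class can share a base and every function returns the same type (`int` here). It does not by itself cover the case raised elsewhere in the thread, where the selected objects should yield different types.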
>>>>>>
>>>>>>
>>>>>>
>>>>>> So I ask again, what is the problem (not the solution)?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Steve
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 6:03 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> An extended reason why this class of problems described in the
>>>>>> previous emails exists is that the current constructs don't give enough
>>>>>> context to intermediate code generation and rely solely on LLVM or other
>>>>>> back ends being advanced enough. For a language that doesn't make sense,
>>>>>> because languages are meant to be less verbose and more explicit than
>>>>>> a compiler back end's code generation tools.
>>>>>>
>>>>>>
>>>>>>
>>>>>> (Really sorry for sending two emails at once)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards, Muneem.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 4:54 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>
>>>>>> The actual class of problems is code repetition and obscuring of
>>>>>> intent:
>>>>>>
>>>>>> 1. Say you want to choose between three container objects but they
>>>>>> don't have different types.
>>>>>>
>>>>>> 2. Say you want to choose between any three objects but they have
>>>>>> different types.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The current fix for both is to store those objects in a
>>>>>> std::vector/array with a std::variant element type, or to use switch
>>>>>> statements. All of these obscure intent, making it hard to optimize code
>>>>>> during intermediate code generation. This is actually more common than
>>>>>> we think; in fact, we can destroy 99% of germs (branch statements) with
>>>>>> function objects that can be indexed.
>>>>>>
>>>>>>
>>>>>>
>>>>>> In short the class of problems is using too many branch statements
>>>>>> for everything that can't be indexed by current containers.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 4:24 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi Muneem,
>>>>>>
>>>>>>
>>>>>>
>>>>>> You mention “this class of problems” I would like to know what that
>>>>>> is? Please forget about a heterogenous list. What is the root problem
>>>>>> that the heterogenous list solves? Please describe an actual problem, not
>>>>>> a “it would be nice to solve.”
>>>>>>
>>>>>>
>>>>>>
>>>>>> To make this a little easier and to involve less people, here is my
>>>>>> email: weinrich.steve_at_[hidden]
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 5:17 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks for your interest ❤️❤️❤️
>>>>>>
>>>>>> If you believe that any other solution may be a better option for this
>>>>>> class of problems, then please let me know. In fact, we could collaborate
>>>>>> on the proposal. This is not at odds with C++ because the (recommended)
>>>>>> semantics is a construct that captures values using const references, as
>>>>>> opposed to storing them directly as a struct would, and lets the compiler
>>>>>> access the value by reference, inline code for each branch, or do what it
>>>>>> thinks is best. It is similar to the heterogeneous lists provided by Go,
>>>>>> but not the same, because it captures by const reference, and references
>>>>>> aren't the same as non-const pointers, in that they can be optimized in
>>>>>> ways that const pointers can't be. Sorry for writing too much, I just got
>>>>>> too excited by someone openly taking an interest in this proposal ❤️❤️❤️❤️.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards, Muneem
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 4:11 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi Muneem,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks. I am interested in the original problem. I am curious if a
>>>>>> heterogeneous list is the optimal solution for the problem at hand. Over
>>>>>> the last 50+ years of programming, I have encountered many questions like
>>>>>> yours, which presume a particular solution. Sometimes that solution is at
>>>>>> odds with the language at hand (or other issues). I find that
>>>>>> understanding the original problem allows me to better understand why one
>>>>>> is suggesting a particular language enhancement.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 5:03 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> >context for readers:second one was a class of problems fixed by
>>>>>> heterogeneous lists.
>>>>>>
>>>>>> It's the second one, and I am really really sorry for the confusion.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 3:58 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> If I may, a heterogeneous list is not a problem, it is a solution to
>>>>>> a problem. So there are now two possibilities:
>>>>>>
>>>>>> 1. You are trying to create a new std container to solve a class
>>>>>> of problems.
>>>>>> 2. You have some other problem that you have used a heterogenous
>>>>>> list to solve.
>>>>>>
>>>>>>
>>>>>>
>>>>>> What say you?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 4:54 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sorry, I meant "heterogeneous lists is the problem", so really really
>>>>>> sorry for that mistake, It was an auto correct mistake.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 3:52 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>
>>>>>> Heterogeneous problems is the problem that leads to the code bloat or
>>>>>> verbosity that I described, but it's hard to choose whether the chicken
>>>>>> came first or the egg in this regard.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards, Muneem
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 3:50 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi Muneem,
>>>>>>
>>>>>>
>>>>>>
>>>>>> If you don’t mind, I would like to limit this portion of our
>>>>>> interaction to simply describing the problem. I want to get a 100%
>>>>>> understanding of the problem, unfettered by any assumptions (including C++
>>>>>> limitations) or previous solutions. Is a heterogenous list actually the
>>>>>> problem or is that simply a solution that you think fits the problem at
>>>>>> hand?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Steve
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 4:40 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> Thanks for your response!!!
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, your understanding of the problem is mostly correct, with a few
>>>>>> details left out. The alternative to using classes would be switch
>>>>>> statements, which as discussed would lead to code bloat and might backfire.
>>>>>> The goal is to make a construct that makes indexing into heterogeneous
>>>>>> lists easier. There is, however, one thing that I would like to clarify:
>>>>>>
>>>>>> The goal isn't classic polymorphism but to be able to return an
>>>>>> object of any type through a specified interface (indexing heterogeneous
>>>>>> lists), which your example doesn't do. Basically, just as Go and many
>>>>>> other compiled languages support heterogeneous lists, I want C++ to do so
>>>>>> as well. Say you want to index a bunch of containers (to use a top()
>>>>>> function): std::visit fails because it can't return any return type that
>>>>>> you would wish for, so you can either use switch case to write the top
>>>>>> expression for each container or use this long, verbose technique:
>>>>>>
>>>>>> #include <deque>
>>>>>> #include <memory>
>>>>>> #include <vector>
>>>>>>
>>>>>> template<typename T>
>>>>>> struct Base {
>>>>>>     virtual ~Base() = default;
>>>>>>     virtual T top_wrapper() = 0;
>>>>>> };
>>>>>>
>>>>>> template<typename T>
>>>>>> struct Derived_1 : Base<T>, std::vector<T> {
>>>>>>     T top_wrapper() override {
>>>>>>         // std::vector has no top(), so back() stands in for it.
>>>>>>         // T{} is just to show that it works even if the call had
>>>>>>         // some other return type.
>>>>>>         return T{this->back()};
>>>>>>     }
>>>>>> };
>>>>>>
>>>>>> template<typename T>
>>>>>> struct Derived_2 : Base<T>, std::deque<T> {
>>>>>>     T top_wrapper() override {
>>>>>>         // Same as above: back() stands in for top().
>>>>>>         return T{this->back()};
>>>>>>     }
>>>>>> };
>>>>>>
>>>>>> int main() {
>>>>>>     // Pointers are stored because a std::vector<Base<int>> would
>>>>>>     // slice the derived objects (and Base is abstract).
>>>>>>     std::vector<std::unique_ptr<Base<int>>> a;
>>>>>>     // The compiler would probably optimize this example, but not an
>>>>>>     // example where you index the vector using real-time input.
>>>>>>     return 0;
>>>>>> }
>>>>>>
>>>>>> // A vector of std::variant only works if I am willing to write a
>>>>>> helper top(std::variant<Args...> obj) function that uses std::visit to
>>>>>> call top(), which in and of itself is not only verbose but obscures intent
>>>>>> and context.
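
The std::variant alternative mentioned in that comment could be sketched as follows. The alias `IntContainer` and the helpers `top` and `top_at` are hypothetical names, and `back()` is used because neither `std::vector` nor `std::deque` has a `top()` member.

```cpp
#include <cstddef>
#include <deque>
#include <variant>
#include <vector>

// One alternative per container type that may appear in the list.
using IntContainer = std::variant<std::vector<int>, std::deque<int>>;

// Hypothetical helper: returns the last element of whichever container the
// variant currently holds. Precondition: that container is not empty.
int top(const IntContainer& c) {
    // The generic lambda is instantiated once per alternative; std::visit
    // requires every instantiation to return the same type (int here).
    return std::visit([](const auto& held) { return held.back(); }, c);
}

// Runtime subscripting into the heterogeneous list, then dispatch.
int top_at(const std::vector<IntContainer>& containers, std::size_t i) {
    return top(containers.at(i));
}
```

The helper compiles because both alternatives yield an `int`. If the containers held different element types, the visitor would have to convert to a common type or return another variant, which is the limitation of std::visit that the message describes.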
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards, Muneem.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 2:43 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi Muneem,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I would like to make sure that I understand this problem before going
>>>>>> on.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think there are several classes that have a portion (or all) of
>>>>>> their interface in common. Each method of the interface returns a constant
>>>>>> value:
>>>>>>
>>>>>>
>>>>>>
>>>>>> class First
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA () const { return 1123; }
>>>>>>
>>>>>> int funcB () const { return 1234; }
>>>>>>
>>>>>> int funcC () const { return 1456; }
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> class Second
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA () const { return 2123; }
>>>>>>
>>>>>> int funcB () const { return 2234; }
>>>>>>
>>>>>> int funcC () const { return 2456; }
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> class Third
>>>>>>
>>>>>> {
>>>>>>
>>>>>> public:
>>>>>>
>>>>>> int funcA () const { return 3123; }
>>>>>>
>>>>>> int funcB () const { return 3234; }
>>>>>>
>>>>>> int funcC () const { return 3456; }
>>>>>>
>>>>>> };
>>>>>>
>>>>>>
>>>>>>
>>>>>> 1. We would like a means to be able to add more classes easily.
>>>>>> 2. We would like a means to be able to add to the shared
>>>>>> interface easily.
>>>>>> 3. We would like to be able to use the shared interface in a
>>>>>> polymorphic way (like a virtual method).
>>>>>> 4. Performance is of the utmost importance.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is my understanding correct?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>> *Sent:* Friday, April 3, 2026 1:54 PM
>>>>>> *To:* std-proposals_at_[hidden]
>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>> polymorphism proposed
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sorry for sending two emails at once!
>>>>>>
>>>>>> I just wanted to restate that the point of the whole proposal
>>>>>> is to provide intent. The code that Mr. Macieira was kind enough to bring
>>>>>> forward proves my exact point: with enough intent, the compiler can
>>>>>> optimize anything, and these optimizations grow larger as the scale of the
>>>>>> program grows larger. Microbenchmarks might show a single example, but even
>>>>>> that single example should get us thinking: why is it so slow for this
>>>>>> one example? Does this overhead force people to write switch case
>>>>>> statements that can lead to code bloat, which can again backfire in terms
>>>>>> of performance?
>>>>>>
>>>>>> Regards, Muneem.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 12:48 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> Thanks again for your feedback, Macieira. 👍
>>>>>>
>>>>>> >micro benchmark is misleading
>>>>>>
>>>>>> 1. The reason that I gave you microbenchmarks is that some people asked
>>>>>> for them, and even I was reluctant to use them despite Bjarne
>>>>>> Stroustrup's quote
>>>>>>
>>>>>> "Don't assume, measure" because in this case, the goal is to either
>>>>>> make the compiler smaller or runtime faster, both of which are targeted by
>>>>>> my new proposal.
>>>>>>
>>>>>> 2. You are right that the compiler might have folded the loop in
>>>>>> half, but the point is that it still shows that the observable behaviour
>>>>>> is the same. In fact, if the loop body were to index into a heterogeneous
>>>>>> set (using the proposed construct) and do some operation, then the compiler
>>>>>> would optimize the indexing if the source of the index is one. This proves
>>>>>> that intent can help the compiler do wonders:
>>>>>>
>>>>>> 1.Fold loops even when I used volatile to avoid it.
>>>>>>
>>>>>> 2.Avoid the entire indexing operations (if in a loop with the most
>>>>>> minimal compile time overhead)
>>>>>>
>>>>>> 3. Store the result immediately after it takes input into some memory
>>>>>> location (if that solution is the fastest).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3.Optimize a single expression for the sake of the whole program.
>>>>>>
>>>>>> Currently, the optimizer might in fact be able to optimize checks in
>>>>>> a loop, but it's not as easy or as guaranteed, because there are no
>>>>>> semantic promises that we can make with the existing constructs to make
>>>>>> it happen.
>>>>>>
>>>>>> 4. My main point isn't whether my benchmark is correct or wrong, but
>>>>>> rather that expressing intent is better. The benchmark was merely to show
>>>>>> that std::visit is slower (according to programs compiled with g++ and
>>>>>> Microsoft Visual Studio 2026, measured with std::chrono and the Visual
>>>>>> Studio 2026 CPU usage tools), but even if some or all compilers optimize
>>>>>> its performance, we still have compile-time overhead
>>>>>> for taking std::visit and making it faster, and the optimization might
>>>>>> backfire since it would optimize single statements independently of
>>>>>> what's in the rest of the program. Why? Because unlike my proposed
>>>>>> construct, std::visit does not have enough context and intent to tell the
>>>>>> compiler what's going on so that it can generate code that has the exact
>>>>>> "book keeping" data and access code that fits the entire program.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3. In case someone thinks a few nanoseconds in a single example
>>>>>> isn't a big deal, then rethink it, because if my construct is accepted then
>>>>>> yes, it would not be a big deal, because the compiler can optimize many
>>>>>> indexing operations into a single heterogeneous set and maybe cache the
>>>>>> result afterwards somewhere. The issue is that this can't be done with the
>>>>>> current techniques because of the lack of intent. Compilers are much
>>>>>> smarter than we could ever be because they are the work of many people's
>>>>>> entire careers, not just one very smart guy from Intel, so blaming/restricting
>>>>>> compilers whose job is to be as general for the sake of the whole program.
>>>>>>
>>>>>> 4. >I suppose it decided to unroll the loop a bit
>>>>>>
>>>>>> >and made two calls to sink() per loop:
>>>>>>
>>>>>> >template <typename T> void sink(const T &) { asm volatile("" :::
>>>>>> >"memory"); }
>>>>>>
>>>>>> Even if it optimized switch case statement using volatile("" :::
>>>>>> "memory"); but not std::visit
>>>>>>
>>>>>> My point isn't that switch case is magically faster, but
>>>>>> rather that the compiler has more room to cheat and skip things. In fact,
>>>>>> the standard allows it a lot of free room as long as the observable
>>>>>> behaviour is the same, even more so by giving it free room with sets of
>>>>>> observable behaviours (unspecified behaviours).
>>>>>>
>>>>>> 5. Microbenchmarking wasn't meant to show that std::visit is inherently
>>>>>> slower, but rather that the compiler can and should make mistakes in
>>>>>> optimizing it, in order to avoid massive compile-time overhead.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, 3 Apr 2026, 8:33 pm Thiago Macieira via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>> On Thursday, 2 April 2026 19:15:42 Pacific Daylight Time Thiago
>>>>>> Macieira via Std-Proposals wrote:
>>>>>> > Even in this case, I have profiled the code above (after fixing it and
>>>>>> > removing the std::cout itself) and found that overall, the switched case
>>>>>> > ran 2x faster, at 0.113 ns per iteration, while the variant case required
>>>>>> > 0.227 ns per iteration. Looking at the CPU performance counters, the
>>>>>> > std::variant code has 2 branches per iteration and takes 1 cycle per
>>>>>> > iteration, running at 5 IPC (thus, 5 instructions per iteration).
>>>>>> > Meanwhile, the switched case has 0.5 branch per iteration and takes 0.5
>>>>>> > cycle per iteration, running at 2 IPC. The half cycle numbers make sense
>>>>>> > because I believe the two instructions are getting macrofused together and
>>>>>> > execute as a single uop, which causes confusing numbers.
>>>>>>
>>>>>> This half a cycle and ninth of a nanosecond problem has been on my
>>>>>> mind for a while. The execution time of anything needs to be a multiple
>>>>>> of the cycle time, so a CPU running at 4.5 GHz like mine was shouldn't
>>>>>> have a difference of one ninth of a nanosecond. One explanation would be
>>>>>> that somehow the CPU was executing two iterations of the loop at the
>>>>>> same time, pipelining.
>>>>>>
>>>>>> But disassembling the binary shows a simpler explanation. The switch
>>>>>> loop was:
>>>>>>
>>>>>> 40149f: mov $0x3b9aca00,%eax
>>>>>> 4014a4: nop
>>>>>> 4014a5: data16 cs nopw 0x0(%rax,%rax,1)
>>>>>> 4014b0: sub $0x2,%eax
>>>>>> 4014b3: jne 4014b0
>>>>>>
>>>>>> [Note how there is no test for what was being indexed in the loop!]
>>>>>>
>>>>>> Here's what I had missed: sub $2. I'm not entirely certain what GCC was
>>>>>> thinking here, but it's subtracting 2 instead of 1, so this looped half a
>>>>>> billion times (0x3b9aca00 / 2). I suppose it decided to unroll the loop a
>>>>>> bit and made two calls to sink() per loop:
>>>>>>
>>>>>> template <typename T> void sink(const T &) { asm volatile("" :::
>>>>>> "memory"); }
>>>>>>
>>>>>> But that expanded to nothing in the output. I could add "nop" so we'd
>>>>>> see what happened and the CPU would be obligated to retire those
>>>>>> instructions, increasing the instruction executed counter (I can't
>>>>>> quickly find how many the TGL processor / WLC core can retire per cycle,
>>>>>> but I recall it's 6, so adding 2 more instructions shouldn't affect the
>>>>>> execution time). But I don't think I need to further benchmark this to
>>>>>> prove my point:
>>>>>>
>>>>>> The microbenchmark is misleading.
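
One way to make such a benchmark loop harder to optimize away is a sink that takes the value as an input operand, in the style of the "do not optimize" helpers found in benchmarking libraries. The sketch below is an assumption-laden illustration, not the code that was measured in this thread: `keep_alive` and `sum_to` are hypothetical names, and the inline assembly form relies on GCC or Clang extended asm, with a volatile read as a fallback elsewhere.

```cpp
#include <cstdint>

// Hypothetical benchmark helper. Unlike a sink whose asm statement has no
// operands, this one names the value as an input, so the compiler has to
// have the value available (in a register or in memory) at this point.
template <typename T>
inline void keep_alive(const T& value) {
#if defined(__GNUC__)
    // Empty instruction sequence; "r,m" lets the compiler pick a register or
    // a memory operand, and "memory" acts as a compiler-level barrier.
    asm volatile("" : : "r,m"(value) : "memory");
#else
    // Fallback for other compilers: a volatile read of the first byte.
    const volatile char* p = reinterpret_cast<const volatile char*>(&value);
    (void)*p;
#endif
}

// Sums 0..n-1 and passes every partial sum through keep_alive, so each
// iteration has a visible use of its result.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
        keep_alive(total);
    }
    return total;
}
```

Even with such a sink the compiler may still unroll or otherwise reshape the loop, so the disassembly and the performance counters remain the place to check what was actually measured.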
>>>>>>
>>>>>> --
>>>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>>>> --
>>>>>> Std-Proposals mailing list
>>>>>> Std-Proposals_at_[hidden]
>>>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>

Received on 2026-04-04 01:57:29