std-proposals

Re: [std-proposals] Extension to runtime polymorphism proposed

From: Muneem <itfllow123_at_[hidden]>
Date: Sat, 4 Apr 2026 07:06:49 +0500
The question is: why should we use branch constructs to step through multiple
branches that all call the same functions on a list of objects?

Please tell me what I am missing.
If you want, I can research the exact terms from the standard and rewrite
my proposal using those terms to describe what I want to propose.
Regards, Muneem
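
For readers following the thread, here is a minimal sketch of what "branching through multiple branches that call the same function" looks like with today's tools, when the objects live in a std::tuple and the index is only known at runtime. The helper names visit_at and visit_at_impl, the element types and call_funcA are hypothetical, chosen only for illustration; this is not the construct being proposed.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical helper: calls f on the element of t at runtime position i.
template <typename Tuple, typename F, std::size_t... Is>
void visit_at_impl(const Tuple& t, std::size_t i, F& f, std::index_sequence<Is...>) {
    // The fold expands to one comparison per element; only the match invokes f.
    ((i == Is ? static_cast<void>(f(std::get<Is>(t))) : static_cast<void>(0)), ...);
}

template <typename... Ts, typename F>
void visit_at(const std::tuple<Ts...>& t, std::size_t i, F f) {
    assert(i < sizeof...(Ts));  // an out-of-range index would silently do nothing
    visit_at_impl(t, i, f, std::index_sequence_for<Ts...>{});
}

// Hypothetical element types that share a member function name but no base class.
struct Alpha { int funcA() const { return 1; } };
struct Beta  { int funcA() const { return 2; } };
struct Gamma { int funcA() const { return 3; } };

// Calls funcA() on whichever object the runtime index selects.
int call_funcA(std::size_t i) {
    const std::tuple<Alpha, Beta, Gamma> objects{};
    int result = 0;
    visit_at(objects, i, [&result](const auto& obj) { result = obj.funcA(); });
    return result;
}
```

The fold expression is still a chain of comparisons underneath, so the branching has not gone away; it has only been generated by the template instead of written by hand.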

On Sat, 4 Apr 2026, 6:57 am Muneem, <itfllow123_at_[hidden]> wrote:

> In standardese:
> I want the abstract machine to render any of the values chosen at runtime
> in place, regardless of their type.
>
> On Sat, 4 Apr 2026, 6:54 am Muneem, <itfllow123_at_[hidden]> wrote:
>
>> Think of the intermediate code representation as the stage where the C++
>> front end is finished and the backend of the compiler takes over
>> (for example LLVM, which then translates the code to assembly).
>>
>> On Sat, 4 Apr 2026, 6:47 am Muneem, <itfllow123_at_[hidden]> wrote:
>>
>>> I really wish that you could advise me on how to present this problem,
>>> because it does seem like I may be incapable of putting it into words.
>>> Please write a counter-proposal; I would really appreciate it.
>>> Regards, Muneem.
>>>
>>> On Sat, 4 Apr 2026, 6:43 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>
>>>> I am really, really sorry if my response was not satisfactory.
>>>> In short, all I want is to solve the problem where we can't tell the
>>>> compiler to "render an object of any type in place at the intermediate code
>>>> level". This is a problem because we have to do it ourselves,
>>>> unlike in languages like Go. Once this problem is solved, heterogeneous
>>>> lists would be possible.
>>>> Regards, Muneem
>>>>
>>>> On Sat, 4 Apr 2026, 6:24 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>>> Hi Muneem,
>>>>>
>>>>> I had hoped by asking you to explain the problem you are trying to
>>>>> solve, I might be able to help you describe things in a way that would make
>>>>> it easier for you to describe the exact language feature you are proposing.
>>>>>
>>>>> Either I am not smart enough to understand you (a good probability) or
>>>>> you are incapable of describing what you seek in C++ language terms. Keep
>>>>> in mind that the committee deals only with the language!
>>>>>
>>>>> I wish you all the best.
>>>>>
>>>>> Cheers,
>>>>> Steve
>>>>>
>>>>> On Fri, Apr 3, 2026, 19:09 Muneem via Std-Proposals <
>>>>> std-proposals_at_[hidden]> wrote:
>>>>>
>>>>>> Thank you for your response!
>>>>>> What is the problem (the question you asked)?
>>>>>> There are many possible answers to what your question might have
>>>>>> meant (I hope one of these answers it):
>>>>>>
>>>>>> 1. Before C++ existed and before it had its static type system, this
>>>>>> problem could be quantified as branching overhead and verbosity, which is
>>>>>> why we needed better techniques that avoided it.
>>>>>> This Wikipedia article explains the first time that branching was a
>>>>>> problem (all credit to Wikipedia:
>>>>>> https://en.wikipedia.org/wiki/Branch_table):
>>>>>> Use of branch tables and other raw data encoding was common in the
>>>>>> early days of computing when memory was expensive, CPUs were slower and
>>>>>> compact data representation and efficient choice of alternatives were
>>>>>> important. This is truer now than ever. In particular, the issue
>>>>>> with branching is pointer indirection through any implementation such as
>>>>>> vtables or function pointers. Sometimes branching may in fact be faster,
>>>>>> but again, constructs that give no context and intent to the compiler make
>>>>>> it hard for the compiler to decide what to do. This is why C++ provides if
>>>>>> statements, switch statements, ternary expressions, and many more, so that
>>>>>> it can produce the best intermediate code representation possible. Each type
>>>>>> of branch statement isn't just a new syntax; it makes the user write code a
>>>>>> certain way, and lets the compiler do optimizations before the compiler
>>>>>> backend reads the code.
>>>>>>
>>>>>> 2. The problem is the lack of a language-level construct for type
>>>>>> erasure that can result in an optimized intermediate representation of
>>>>>> the code.
>>>>>>
>>>>>> 3. The problem is that virtual functions don't do it, switch case
>>>>>> statements don't do it, manual type erasure code doesn't do it; nothing
>>>>>> does. The "it" is type erasure patterns. Switch case statements (and
>>>>>> others) fail completely unless you write the code for each object again
>>>>>> and again.
>>>>>>
>>>>>> 4. The problem is the verbosity of current branching/polymorphism
>>>>>> techniques for type erasure. Not only that, but you can't even overload a
>>>>>> polymorphic function to return a different type based on an argument (unless
>>>>>> the return type is a polymorphic class, known as "Return Type Relaxation" in
>>>>>> "The C++ Programming Language", 4th edition, by Bjarne Stroustrup, section
>>>>>> 20.3.6) in order to fix this problem with the visitor pattern or some other
>>>>>> double dispatch pattern.
>>>>>>
>>>>>> 5. The problem is the lack of a clear expression of type erasure.
>>>>>>
>>>>>>
>>>>>> 2. I don't want to make a heterogeneous list in the traditional sense,
>>>>>> but rather a list of const references of any type, so it isn't against C++
>>>>>> philosophies; it's just trying to automate the process of manual type
>>>>>> erasure and leave it to the compiler to produce the optimal intermediate
>>>>>> code representation based on the specific program, the context of every
>>>>>> subscripting operation, and the source of every subscripting operation.
>>>>>> That is as C++ as one can get. It would not be C++ if I were to just ask for
>>>>>> a heterogeneous list that is completely up to the implementation; rather,
>>>>>> I want type erasure for const lvalue references at the language
>>>>>> level (optimized intermediate code representation). Think of it like
>>>>>> templates, if that makes it easier to reason about. In fact, this is a form
>>>>>> of "Return Type Relaxation" ("The C++ Programming Language", 4th edition, by
>>>>>> Bjarne Stroustrup, section 20.3.6), but instead of pointers and references,
>>>>>> I only want to use const references.
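
As an illustration of the "manual type erasure" over const references that the message above wants automated, here is a minimal hand-written sketch. The class SizeRef and the function size_at are hypothetical names introduced only for this example; the sketch assumes every referenced type has a size() member and that the referenced objects outlive the handle.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical non-owning handle: erases the static type of a const reference
// and keeps one function pointer that knows how to call size() on it.
class SizeRef {
public:
    template <typename T>
    SizeRef(const T& obj)
        : ptr_(&obj),
          size_([](const void* p) -> std::size_t {
              return static_cast<const T*>(p)->size();
          }) {}

    std::size_t size() const { return size_(ptr_); }

private:
    const void* ptr_;                   // the referenced object must outlive the handle
    std::size_t (*size_)(const void*);  // restores the type and forwards the call
};

// Indexes a list of const references to objects of unrelated types.
std::size_t size_at(std::size_t i) {
    static const std::vector<int> v{1, 2, 3};
    static const std::deque<double> d{1.0, 2.0};
    static const std::string s = "hello";
    const SizeRef refs[] = {v, d, s};
    assert(i < 3);
    return refs[i].size();
}
```

Each call goes through one function pointer, which is the same indirection a vtable would add; the sketch only shows how much boilerplate the hand-written version needs for a single operation.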
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 4 Apr 2026, 5:22 am Steve Weinrich via Std-Proposals, <
>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>
>>>>>>> Hi Muneem,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I am not trying to be difficult, but the “problems” that you have
>>>>>>> described are {potentially} the result of implementation choices and/or C++
>>>>>>> limitations. What I am currently interested in is the problem that was
>>>>>>> presented before making those choices.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Let me see if I can give an example. Someone says, I have a
>>>>>>> problem. Every time I insert an object into std::vector, I have to re-sort
>>>>>>> the vector. Obviously, they are using the wrong container. *But
>>>>>>> even if they switch to a std::map, we don’t know if the data truly needs to
>>>>>>> be sorted. Or how frequently it needs to be sorted.*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You say, “You want to choose between three container objects but
>>>>>>> they don't have different type.”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That could look like:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> std::vector<int> alpha;
>>>>>>>
>>>>>>> std::vector<int> beta;
>>>>>>>
>>>>>>> std::vector<int> gamma;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This immediately raises the question, “Why three vectors?”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You say, “You want to choose between any three objects but they have
>>>>>>> different types.”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That could look like:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class Alpha
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA ();
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> private:
>>>>>>>
>>>>>>> // Some Alpha specific data
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class Beta
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA ();
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> private:
>>>>>>>
>>>>>>> // Some Beta specific data
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class Gamma
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA ();
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> private:
>>>>>>>
>>>>>>> // Some Gamma specific data
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That immediately raises the question, “Will virtual functions work?”
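
One way to picture the virtual-function answer to that question is the sketch below. The base class Interface, the return values, and the helpers make_objects and call_funcA are assumptions added for illustration; the classes in the message above declare funcA without a common base.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical common base: the shared part of the interface.
struct Interface {
    virtual ~Interface() = default;
    virtual int funcA() const = 0;
};

struct Alpha : Interface { int funcA() const override { return 1; } };
struct Beta  : Interface { int funcA() const override { return 2; } };
struct Gamma : Interface { int funcA() const override { return 3; } };

// Builds a list whose elements have different dynamic types.
std::vector<std::unique_ptr<Interface>> make_objects() {
    std::vector<std::unique_ptr<Interface>> objects;
    objects.push_back(std::make_unique<Alpha>());
    objects.push_back(std::make_unique<Beta>());
    objects.push_back(std::make_unique<Gamma>());
    return objects;
}

// The runtime index selects the object; the vtable selects the function.
int call_funcA(const std::vector<std::unique_ptr<Interface>>& objects, std::size_t i) {
    assert(i < objects.size());
    return objects[i]->funcA();
}
```

This works when every funcA returns the same type. It does not by itself cover the case raised elsewhere in the thread, where each element should yield a differently typed result.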
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> So I ask again, what is the problem (not the solution)?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Steve
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 6:03 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> An extended reason why the class of problems described in the
>>>>>>> previous emails exists is that the current constructs don't give enough
>>>>>>> context to intermediate code generation and rely solely on LLVM or other
>>>>>>> backends being advanced enough, which doesn't make sense for a language,
>>>>>>> because languages are meant to be less verbose and more explicit than
>>>>>>> compiler backend code generation tools.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (Really sorry for sending two emails at once)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards, Muneem.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 4:54 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>>
>>>>>>> The actual class of problems is code repetition and obscuring of
>>>>>>> intent:
>>>>>>>
>>>>>>> 1.Say you want to choose between three container objects but they
>>>>>>> don't have different type.
>>>>>>>
>>>>>>> 2. Say you want to choose between any three objects but they have
>>>>>>> different types.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The current fix for both is to allocate those objects in a
>>>>>>> std::vector/array with a std::variant element type or to use switch
>>>>>>> statements. All of these obscure intent, making it hard to optimize code
>>>>>>> during intermediate code generation. This is actually more common than we
>>>>>>> think; in fact, we can destroy 99% of germs (branch statements) with
>>>>>>> function objects that can be indexed.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> In short the class of problems is using too many branch statements
>>>>>>> for everything that can't be indexed by current containers.
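
The contrast between branch statements and "function objects that can be indexed" can be sketched as follows. The operations add, sub and mul and the names apply_switch, apply_table and ops are hypothetical, picked only to keep the example small; both versions are assumed to receive an index in the range 0 to 2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
int mul(int a, int b) { return a * b; }

// Branching version: one case per alternative, written out by hand.
int apply_switch(std::size_t op, int a, int b) {
    switch (op) {
        case 0: return add(a, b);
        case 1: return sub(a, b);
        case 2: return mul(a, b);
        default: return 0;
    }
}

// Table version: the alternatives are data, and the selection is an index.
using BinaryOp = int (*)(int, int);
constexpr std::array<BinaryOp, 3> ops{add, sub, mul};

int apply_table(std::size_t op, int a, int b) {
    assert(op < ops.size());
    return ops[op](a, b);
}
```

The table only works here because all three functions share one signature. Once the alternatives have different types, a plain array no longer fits, which is the gap the thread is arguing about.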
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 4:24 am Steve Weinrich via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Hi Muneem,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You mention “this class of problems”. I would like to know what that
>>>>>>> is. Please forget about a heterogenous list. What is the root problem
>>>>>>> that the heterogenous list solves? Please describe an actual problem, not
>>>>>>> a “it would be nice to solve.”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To make this a little easier and to involve fewer people, here is my
>>>>>>> email: weinrich.steve_at_[hidden]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 5:17 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks for your interest ❤️❤️❤️
>>>>>>>
>>>>>>> If you believe that any other solution may be a better option for
>>>>>>> this class of problems then please let me know. In fact, we could
>>>>>>> collaborate on the proposal. This is not at odds with C++ because the
>>>>>>> (recommended) semantics is a construct that captures values using const
>>>>>>> references, as opposed to storing them directly as a struct would, and lets
>>>>>>> the compiler access the value by reference, inline code for each branch,
>>>>>>> or do what it thinks is best. It is similar to the heterogeneous lists
>>>>>>> provided by Go, but not the same, because it captures by const reference,
>>>>>>> and references aren't the same as non-const pointers, in that they can
>>>>>>> be optimized in ways that const pointers can't be. Sorry for writing too
>>>>>>> much, I just got too excited by someone openly taking an interest in this
>>>>>>> proposal ❤️❤️❤️❤️.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards, Muneem
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 4:11 am Steve Weinrich via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Hi Muneem,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks. I am interested in the original problem. I am curious if a
>>>>>>> heterogeneous list is the optimal solution for the problem at hand. Over
>>>>>>> the last 50+ years of programming, I have encountered many questions like
>>>>>>> yours, which presume a particular solution. Sometimes that solution is at
>>>>>>> odds with the language at hand (or other issues). I find that
>>>>>>> understanding the original problem allows me to better understand why one
>>>>>>> is suggesting a particular language enhancement.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 5:03 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> >context for readers:second one was a class of problems fixed by
>>>>>>> heterogeneous lists.
>>>>>>>
>>>>>>> It's the second one, and I am really really sorry for the confusion.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 3:58 am Steve Weinrich via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> If I may, a heterogeneous list is not a problem, it is a solution
>>>>>>> to a problem. So there are now two possibilities:
>>>>>>>
>>>>>>> 1. You are trying to create a new std container to solve a class
>>>>>>> of problems.
>>>>>>> 2. You have some other problem that you have used a heterogenous
>>>>>>> list to solve.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> What say you?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 4:54 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Sorry, I meant "heterogeneous lists is the problem", so really
>>>>>>> really sorry for that mistake, It was an auto correct mistake.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 3:52 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Heterogeneous problems is the problem that leads to the code bloat
>>>>>>> or verbosity that I described, but it's hard to choose whether the chicken
>>>>>>> came first or the egg in this regard.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards, Muneem
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 3:50 am Steve Weinrich via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Hi Muneem,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If you don’t mind, I would like to limit this portion of our
>>>>>>> interaction to simply describing the problem. I want to get a 100%
>>>>>>> understanding of the problem, unfettered by any assumptions (including C++
>>>>>>> limitations) or previous solutions. Is a heterogenous list actually the
>>>>>>> problem or is that simply a solution that you think fits the problem at
>>>>>>> hand?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 4:40 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> Thanks for your response!!!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes, your understanding of the problem is mostly correct with a few
>>>>>>> details left out. The alternative to using classes would be switch
>>>>>>> statements that, as discussed, would lead to code bloat and might backfire.
>>>>>>> The goal is to make a construct that makes indexing into heterogeneous
>>>>>>> lists easier. There is, however, one thing that I would like to clarify:
>>>>>>>
>>>>>>> The goal isn't classic polymorphism but to be able to return an
>>>>>>> object of any type through a specified interface (indexing heterogeneous
>>>>>>> lists), which your example doesn't do. Basically, just as Go and many
>>>>>>> other compiled languages support heterogeneous lists, I want C++ to do so as
>>>>>>> well. Say you want to index a bunch of containers (to use the top()
>>>>>>> function): std::visit fails because it can't return just any return type
>>>>>>> that you would wish for, so you can either use switch case to write the top
>>>>>>> expression for each container or use this long, verbose technique:
>>>>>>>
>>>>>>> #include <deque>
>>>>>>> #include <memory>
>>>>>>> #include <vector>
>>>>>>>
>>>>>>> template<typename T>
>>>>>>> struct Base {
>>>>>>>     virtual ~Base() = default;
>>>>>>>     virtual T top_wrapper() = 0;
>>>>>>> };
>>>>>>>
>>>>>>> template<typename T>
>>>>>>> struct Derived_1 : Base<T>, std::vector<T> {
>>>>>>>     T top_wrapper() override {
>>>>>>>         // std::vector has no top(); back() is the closest equivalent.
>>>>>>>         // T{} is just to show that it works even if the call had some
>>>>>>>         // other return type.
>>>>>>>         return T{this->back()};
>>>>>>>     }
>>>>>>> };
>>>>>>>
>>>>>>> template<typename T>
>>>>>>> struct Derived_2 : Base<T>, std::deque<T> {
>>>>>>>     T top_wrapper() override {
>>>>>>>         // Same wrapper for std::deque, which also has back() but no top().
>>>>>>>         return T{this->back()};
>>>>>>>     }
>>>>>>> };
>>>>>>>
>>>>>>> int main() {
>>>>>>>     // Store pointers: a std::vector<Base<int>> would slice the derived objects.
>>>>>>>     std::vector<std::unique_ptr<Base<int>>> a;
>>>>>>>     // The compiler would probably optimize this example, but not an
>>>>>>>     // example where you index the vector using real-time input.
>>>>>>>     return 0;
>>>>>>> }
>>>>>>>
>>>>>>> // A vector of std::variant only works if I am willing to write a
>>>>>>> // helper top(std::variant<Args...> obj) function that includes std::visit to
>>>>>>> // call top(), which in and of itself is not only verbose but obscures
>>>>>>> // intent and context.
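
For reference, the std::variant helper mentioned in that last comment might look roughly like the sketch below. It uses std::stack and std::priority_queue because both really do provide top(); the alias Container and the function top_at are hypothetical names, and the fixed int return type is an assumption that sidesteps the "any return type" issue raised above.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <stack>
#include <variant>
#include <vector>

// Hypothetical alias: a closed set of container adaptors that all have top().
using Container = std::variant<std::stack<int>, std::priority_queue<int>>;

// The helper: std::visit picks the active alternative and forwards to its top().
int top(const Container& c) {
    return std::visit(
        [](const auto& x) -> int {
            assert(!x.empty());  // top() on an empty adaptor is undefined behaviour
            return x.top();
        },
        c);
}

// Indexes a list of differently typed containers with a runtime index.
int top_at(std::size_t i) {
    std::stack<int> s;
    s.push(1);
    s.push(7);  // last pushed element is on top
    std::priority_queue<int> q;
    q.push(5);
    q.push(9);
    q.push(2);  // max-heap: the largest element is on top
    const std::vector<Container> containers{s, q};
    assert(i < containers.size());
    return top(containers[i]);
}
```

Every visitor branch must return the same type, here int, so this covers the common-return-type case only. A separate helper of this shape is needed for each member function to be forwarded.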
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards, Muneem.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 2:43 am Steve Weinrich via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Hi Muneem,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I would like to make sure that I understand this problem before
>>>>>>> going on.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think there are several classes that have a portion (or all) of
>>>>>>> their interface in common. Each method of the interface returns a constant
>>>>>>> value:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class First
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA () const { return 1123; }
>>>>>>>
>>>>>>> int funcB () const { return 1234; }
>>>>>>>
>>>>>>> int funcC () const { return 1456; }
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class Second
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA () const { return 2123; }
>>>>>>>
>>>>>>> int funcB () const { return 2234; }
>>>>>>>
>>>>>>> int funcC () const { return 2456; }
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> class Third
>>>>>>>
>>>>>>> {
>>>>>>>
>>>>>>> public:
>>>>>>>
>>>>>>> int funcA () const { return 3123; }
>>>>>>>
>>>>>>> int funcB () const { return 3234; }
>>>>>>>
>>>>>>> int funcC () const { return 3456; }
>>>>>>>
>>>>>>> };
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 1. We would like a means to be able to add more classes easily.
>>>>>>> 2. We would like a means to be able to add to the shared
>>>>>>> interface easily.
>>>>>>> 3. We would like to be able to use the shared interface in a
>>>>>>> polymorphic way (like a virtual method).
>>>>>>> 4. Performance is of the utmost importance.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Is my understanding correct?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>>>>> Behalf Of *Muneem via Std-Proposals
>>>>>>> *Sent:* Friday, April 3, 2026 1:54 PM
>>>>>>> *To:* std-proposals_at_[hidden]
>>>>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime
>>>>>>> polymorphism proposed
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Sorry for sending two emails at once!
>>>>>>>
>>>>>>> I just wanted to restate that the point of the whole
>>>>>>> proposal is to provide intent. The code that Mr. Macieira was kind enough to
>>>>>>> bring forward proves my exact point: with enough intent, the compiler
>>>>>>> can optimize anything, and these optimizations grow larger as the scale of
>>>>>>> the program grows larger. Microbenchmarks might show a single example, but
>>>>>>> even that single example should get us thinking: why is it so slow for
>>>>>>> this one example? Does this overhead force people to write switch case
>>>>>>> statements that can lead to code bloat, which can again backfire in terms of
>>>>>>> performance?
>>>>>>>
>>>>>>> Regards, Muneem.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 4 Apr 2026, 12:48 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> Thanks again for your feedback, Macieira. 👍
>>>>>>>
>>>>>>> >micro benchmark is misleading
>>>>>>>
>>>>>>> 1. The reason that I gave you microbenchmarks is that some asked for
>>>>>>> them, and even I was too reluctant to use them despite Bjarne
>>>>>>> Stroustrup's quote
>>>>>>>
>>>>>>> "Don't assume, measure", because in this case the goal is to either
>>>>>>> make the compiler smaller or the runtime faster, both of which are targeted
>>>>>>> by my new proposal.
>>>>>>>
>>>>>>> 2. You are right that the compiler might have folded the loop in
>>>>>>> half, but the point is that it still shows that the observable behaviour
>>>>>>> is the same. In fact, if the loop body were to index into a heterogeneous
>>>>>>> set (using the proposed construct) and do some operation, then the compiler
>>>>>>> would optimize the indexing if the source of the index is one. This proves
>>>>>>> that intent can help the compiler do wonders:
>>>>>>>
>>>>>>> 1. Fold loops even when I used volatile to avoid it.
>>>>>>>
>>>>>>> 2. Avoid the entire indexing operation (if in a loop, with the most
>>>>>>> minimal compile-time overhead).
>>>>>>>
>>>>>>> 3. Store the result immediately after it takes input into some
>>>>>>> memory location (if that solution is the fastest).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 3. Optimize a single expression for the sake of the whole program.
>>>>>>>
>>>>>>> Currently, the optimizer might in fact be able to optimize checks in
>>>>>>> a loop, but it's not as easy or as guaranteed because there are no
>>>>>>> semantic promises that we can make with the existing constructs to make
>>>>>>> it happen.
>>>>>>>
>>>>>>> 4. My main point isn't whether my benchmark is correct or wrong, but
>>>>>>> rather that expressing intent is better. The benchmark was merely to show
>>>>>>> that std::visit is slower (according to programs compiled with g++ and
>>>>>>> Microsoft Visual Studio 2026, using std::chrono and the Visual Studio 2026
>>>>>>> CPU usage measurement tools to prove my point), but even if some compiler
>>>>>>> or all compilers optimize its performance, we still have compile-time
>>>>>>> overhead for taking std::visit and making it faster, and the optimization
>>>>>>> might backfire since it would optimize single statements independently of
>>>>>>> what's in the rest of the program. Why? Because unlike my proposed
>>>>>>> construct, std::visit does not have enough context and intent to tell the
>>>>>>> compiler what's going on so that it can generate code that has the exact
>>>>>>> "bookkeeping" data and access code that fits the entire program.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 3. In case someone thinks a few nanoseconds in a single example
>>>>>>> isn't a big deal, think again: if my construct is accepted then,
>>>>>>> yes, it would not be a big deal, because the compiler can optimize many
>>>>>>> indexing operations into a single heterogeneous set and maybe cache the
>>>>>>> result afterwards somewhere. The issue is that this can't be done with the
>>>>>>> current techniques because of the lack of intent. Compilers are much
>>>>>>> smarter than we could ever be because they are the work of many people's
>>>>>>> entire careers, not just one very smart guy from Intel, so there is no point
>>>>>>> in blaming or restricting compilers, whose job is to be as general as
>>>>>>> possible for the sake of the whole program.
>>>>>>>
>>>>>>> 4. >I suppose it decided to unroll the loop a bit
>>>>>>>
>>>>>>> >and made two calls to sink() per loop:
>>>>>>>
>>>>>>> >template <typename T> void sink(const T &) { asm volatile("" ::: "memory"); }
>>>>>>>
>>>>>>> Even if it optimized the switch case statement despite
>>>>>>> asm volatile("" ::: "memory"), but not std::visit,
>>>>>>> my point isn't that switch case is magically faster, but
>>>>>>> rather that the compiler has more room to cheat and skip things. In fact, the
>>>>>>> standard allows it a lot of free room as long as the observable behaviour
>>>>>>> is the same, even more so by giving it free room with sets of observable
>>>>>>> behaviours (unspecified behaviours).
>>>>>>>
>>>>>>> 5. Microbenchmarking wasn't to show that std::visit is inherently
>>>>>>> slower, but rather that the compiler can and should make mistakes in
>>>>>>> optimizing it, in order to avoid massive compile-time overhead.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 3 Apr 2026, 8:33 pm Thiago Macieira via Std-Proposals, <
>>>>>>> std-proposals_at_[hidden]> wrote:
>>>>>>>
>>>>>>> On Thursday, 2 April 2026 19:15:42 Pacific Daylight Time Thiago Macieira via
>>>>>>> Std-Proposals wrote:
>>>>>>> > Even in this case, I have profiled the code above (after fixing it and
>>>>>>> > removing the std::cout itself) and found that overall, the switched case
>>>>>>> > ran 2x faster, at 0.113 ns per iteration, while the variant case required
>>>>>>> > 0.227 ns per iteration. Looking at the CPU performance counters, the
>>>>>>> > std::variant code has 2 branches per iteration and takes 1 cycle per
>>>>>>> > iteration, running at 5 IPC (thus, 5 instructions per iteration).
>>>>>>> > Meanwhile, the switched case has 0.5 branch per iteration and takes 0.5
>>>>>>> > cycle per iteration, running at 2 IPC. The half cycle numbers make sense
>>>>>>> > because I believe the two instructions are getting macrofused together and
>>>>>>> > execute as a single uop, which causes confusing numbers.
>>>>>>>
>>>>>>> This half a cycle and ninth of a nanosecond problem has been on my mind for a
>>>>>>> while. The execution time of anything needs to be a multiple of the cycle
>>>>>>> time, so a CPU running at 4.5 GHz like mine shouldn't have a difference of
>>>>>>> one ninth of a nanosecond. One explanation would be that somehow the CPU was
>>>>>>> executing two iterations of the loop at the same time, pipelining.
>>>>>>>
>>>>>>> But disassembling the binary shows a simpler explanation. The switch loop was:
>>>>>>>
>>>>>>> 40149f: mov $0x3b9aca00,%eax
>>>>>>> 4014a4: nop
>>>>>>> 4014a5: data16 cs nopw 0x0(%rax,%rax,1)
>>>>>>> 4014b0: sub $0x2,%eax
>>>>>>> 4014b3: jne 4014b0
>>>>>>>
>>>>>>> [Note how there is no test for what was being indexed in the loop!]
>>>>>>>
>>>>>>> Here's what I had missed: sub $2. I'm not entirely certain what GCC was
>>>>>>> thinking here, but it's subtracting 2 instead of 1, so this looped half a
>>>>>>> billion times (0x3b9aca00 / 2). I suppose it decided to unroll the loop a bit
>>>>>>> and made two calls to sink() per loop:
>>>>>>>
>>>>>>> template <typename T> void sink(const T &) { asm volatile("" ::: "memory"); }
>>>>>>>
>>>>>>> But that expanded to nothing in the output. I could add "nop" so we'd see what
>>>>>>> happened and the CPU would be obligated to retire those instructions,
>>>>>>> increasing the instruction executed counter (I can't quickly find how many the
>>>>>>> TGL processor / WLC core can retire per cycle, but I recall it's 6, so adding
>>>>>>> 2 more instructions shouldn't affect the execution time). But I don't think I
>>>>>>> need to further benchmark this to prove my point:
>>>>>>>
>>>>>>> The microbenchmark is misleading.
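
A common way to keep a benchmarked value alive is to hand it to the empty asm statement as an input operand, rather than relying on the "memory" clobber alone. The sketch below assumes GCC or Clang extended asm (it is not portable to MSVC); the loop body and the name sum_selected are hypothetical and are not the benchmark discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Passing the address as an input operand makes the object visible to the asm
// statement, and the "memory" clobber tells the compiler the asm may read it.
// GCC/Clang extension; an assumption of this sketch, not part of standard C++.
template <typename T>
inline void sink(const T& value) {
    asm volatile("" : : "r"(&value) : "memory");
}

// Hypothetical workload: selects a value per iteration and sinks it.
std::uint64_t sum_selected(std::uint64_t iterations) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        const std::uint64_t picked = (i % 3 == 0) ? 1 : 2;
        sink(picked);  // the selection can no longer be dropped as dead code
        total += picked;
    }
    return total;
}
```

Even with such a barrier, the compiler may still unroll or restructure the loop, so the generated assembly should be inspected before any timing numbers are trusted.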
>>>>>>>
>>>>>>> --
>>>>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>>>>> --
>>>>>>> Std-Proposals mailing list
>>>>>>> Std-Proposals_at_[hidden]
>>>>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>>>>

Received on 2026-04-04 02:07:06