Date: Sat, 4 Apr 2026 06:47:19 +0500
I really wish that you could advise me on how to present this problem, because
it does seem that I may be incapable of putting it into words. Please
make a counter-proposal; I would really appreciate it.
Regards, Muneem.
On Sat, 4 Apr 2026, 6:43 am Muneem, <itfllow123_at_[hidden]> wrote:
> I am really, really sorry if my response was not satisfactory.
> In short, all I want is to solve the problem where we can't tell the
> compiler to "render an object of any type in place at the intermediate code
> level". This is a problem because we have to do it ourselves,
> unlike in languages like Go. Once this problem is solved, heterogeneous
> lists would be possible.
> Regards, Muneem
>
> On Sat, 4 Apr 2026, 6:24 am Steve Weinrich via Std-Proposals, <
> std-proposals_at_[hidden]> wrote:
>
>> Hi Muneem,
>>
>> I had hoped by asking you to explain the problem you are trying to solve,
>> I might be able to help you describe things in a way that would make it
>> easier for you to describe the exact language feature you are proposing.
>>
>> Either I am not smart enough to understand you (a good probability) or you
>> are incapable of describing what you seek in C++ language terms. Keep in
>> mind that the committee deals only with the language!
>>
>> I wish you all the best.
>>
>> Cheers,
>> Steve
>>
>> On Fri, Apr 3, 2026, 19:09 Muneem via Std-Proposals <
>> std-proposals_at_[hidden]> wrote:
>>
>>> Thank you for your response!
>>> What is the problem (the question you asked)?
>>> There are many possible answers to what your question might have
>>> meant (I hope one of these answers it):
>>>
>>> 1. Before C++ existed, and before it had this static type system, this
>>> problem could be quantified as branching overhead and verbosity, which is
>>> why we needed better techniques that avoided it.
>>> This Wikipedia article explains the first time that branching was a
>>> problem (all credit to Wikipedia:
>>> https://en.wikipedia.org/wiki/Branch_table):
>>> Use of branch tables and other raw data encoding was common in the early
>>> days of computing when memory was expensive, CPUs were slower and compact
>>> data representation and efficient choice of alternatives were important.
>>> This is truer now than ever. In particular, the issue in
>>> branching is pointer redirection through an implementation such as vtables or
>>> function pointers. Sometimes branching may in fact be faster, but again,
>>> when a construct gives the compiler no context or intent, it is hard for the
>>> compiler to decide what to do with it. This is why C++ provides if
>>> statements, switch statements, ternary expressions, and many more, so that it
>>> can produce the best intermediate code representation possible. Each kind
>>> of branch statement isn't just new syntax: it makes the user write in a
>>> certain way, and lets the compiler do optimizations before the compiler
>>> backend reads the code.
>>>
>>> 2. The problem is the lack of a language-level
>>> construct for type erasure that can result in an optimized intermediate
>>> representation of the code.
>>>
>>> 3. The problem is that virtual functions don't do it, switch case statements
>>> don't do it, and manual type erasure code doesn't do it; nothing does. The "it"
>>> is type erasure patterns. Switch case statements (and the others) fail
>>> completely unless you write code for each object again and again.
>>>
>>> 4. The problem is the verbosity of current branching/polymorphism techniques
>>> for type erasure. Not only that, but you can't even overload a polymorphic
>>> function to return a different type based on an argument (unless the return
>>> type is a polymorphic class, known as "Return Type Relaxation" in "The C++
>>> Programming Language", 4th edition, by Bjarne Stroustrup, section 20.3.6) in
>>> order to fix this problem with the
>>> visitor pattern or some other double dispatch pattern.
>>>
>>> 5. The problem is the lack of a clear expression of type erasure.
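To make the "Return Type Relaxation" point in item 4 concrete, here is a minimal sketch of covariant return types. The classes Shape and Square and the helper functions are hypothetical and do not appear in the thread. An overrider may narrow a returned pointer or reference to a derived class, but it may not return an unrelated type.

```cpp
#include <cassert>
#include <memory>

// Hypothetical classes, used only to illustrate covariant return types.
struct Shape {
    virtual ~Shape() = default;
    // A raw pointer is returned because covariance applies only to
    // pointers and references, not to smart pointers.
    virtual Shape* clone() const { return new Shape(*this); }
    virtual int sides() const { return 0; }
};

struct Square : Shape {
    // Allowed: Square* is covariant with Shape*.
    Square* clone() const override { return new Square(*this); }
    int sides() const override { return 4; }
    // Not allowed: "int clone() const override;" would be ill-formed,
    // because int is not a pointer or reference to a class derived from Shape.
};

// Calls clone() through the base interface; the static type here is Shape*.
inline int sides_of_clone(const Shape& s) {
    std::unique_ptr<Shape> copy(s.clone());
    return copy->sides();
}

// Calls clone() on the derived type; the static type here is Square*.
inline void covariant_demo() {
    Square original;
    std::unique_ptr<Square> copy(original.clone());
    assert(copy->sides() == 4);
}
```

The relaxation only ever narrows within one class hierarchy, which is why it cannot by itself express "return an object of any type".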
>>>
>>>
>>> 2. I don't want to make a heterogeneous list in the traditional sense, but
>>> rather a list of const references of any type, so it isn't against C++
>>> philosophy; it's just trying to automate the process of manual type
>>> erasure and leave it to the compiler to produce the optimal intermediate code
>>> representation based on the specific program, the context of every
>>> subscripting operation, and the source of every subscript. That is as
>>> C++ as one can get. It would not be C++ if I were to just ask for a
>>> heterogeneous list that is completely up to the implementation; rather,
>>> I want type erasure for const lvalue references at the language
>>> level (optimized intermediate code representation). Think of it like
>>> templates, if that makes it easier to reason about. In fact, this is a form of
>>> "Return Type Relaxation" ("The C++ Programming Language", 4th edition, by
>>> Bjarne Stroustrup, section 20.3.6),
>>> but instead of pointers and references, I want to use only const
>>> references.
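As a rough approximation of "a list of const references of any type" using only today's standard library, one could store std::reference_wrapper alternatives in a std::variant. This is a sketch under that assumption, not the proposed construct: the alias ContainerRef and the helpers last_of and last_at are hypothetical names, and the set of types is closed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <variant>
#include <vector>

// A const reference to one of two container types; nothing is copied.
using ContainerRef = std::variant<
    std::reference_wrapper<const std::vector<int>>,
    std::reference_wrapper<const std::deque<int>>>;

// Returns the last element of whichever container the entry refers to.
inline int last_of(const ContainerRef& entry) {
    return std::visit([](const auto& ref) { return ref.get().back(); }, entry);
}

// Indexes the list at run time and reads through the stored const reference.
inline int last_at(const std::vector<ContainerRef>& refs, std::size_t i) {
    return last_of(refs.at(i));
}

// Small usage example: the list refers to containers owned elsewhere.
inline void const_ref_list_demo() {
    const std::vector<int> v{1, 2, 3};
    const std::deque<int> d{4, 5};
    const std::vector<ContainerRef> refs{std::cref(v), std::cref(d)};
    assert(last_at(refs, 0) == 3);
    assert(last_at(refs, 1) == 5);
}
```

The referenced containers must outlive the list, which is the usual caveat when storing references.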
>>>
>>>
>>>
>>> On Sat, 4 Apr 2026, 5:22 am Steve Weinrich via Std-Proposals, <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> I am not trying to be difficult, but the “problems” that you have
>>>> described are {potentially} the result of implementation choices and/or C++
>>>> limitations. What I am currently interested in is the problem that was
>>>> presented before making those choices.
>>>>
>>>>
>>>>
>>>> Let me see if I can give an example. Someone says, I have a problem.
>>>> Every time I insert an object into std::vector, I have to re-sort the
>>>> vector. Obviously, they are using the wrong container. *But even if
>>>> they switch to a std::map, we don’t know if the data truly needs to be
>>>> sorted. Or how frequently it needs to be sorted.*
>>>>
>>>>
>>>>
>>>> You say, “You want to choose between three container objects but they
>>>> don't have different type.”
>>>>
>>>>
>>>>
>>>> That could look like:
>>>>
>>>>
>>>>
>>>> std::vector<int> alpha;
>>>>
>>>> std::vector<int> beta;
>>>>
>>>> std::vector<int> gamma;
>>>>
>>>>
>>>>
>>>> This immediately raises the question, “Why three vectors?”
>>>>
>>>>
>>>>
>>>> You say, “You want to choose between any three objects but they have
>>>> different types.”
>>>>
>>>>
>>>>
>>>> That could look like:
>>>>
>>>>
>>>>
>>>> class Alpha
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA ();
>>>>
>>>>
>>>>
>>>> private:
>>>>
>>>> // Some Alpha specific data
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> class Beta
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA ();
>>>>
>>>>
>>>>
>>>> private:
>>>>
>>>> // Some Beta specific data
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> class Gamma
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA ();
>>>>
>>>>
>>>>
>>>> private:
>>>>
>>>> // Some Gamma specific data
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> That immediately raises the question, “Will virtual functions work?”
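For reference, the virtual-function answer to that question could look like the sketch below. The interface name Common, the placeholder return values, and the helper call_by_index are assumptions made for illustration; the thread only names Alpha, Beta, Gamma and funcA().

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical common interface; the thread only names funcA().
struct Common {
    virtual ~Common() = default;
    virtual int funcA() = 0;
};

// The return values are placeholders, not taken from the thread.
struct Alpha : Common { int funcA() override { return 1; } };
struct Beta  : Common { int funcA() override { return 2; } };
struct Gamma : Common { int funcA() override { return 3; } };

// Picks one of the three objects by a run-time index and calls funcA()
// through the virtual interface.
inline int call_by_index(const std::array<Common*, 3>& objects, std::size_t i) {
    return objects.at(i)->funcA();
}

// Small usage example.
inline void virtual_demo() {
    Alpha a;
    Beta b;
    Gamma g;
    const std::array<Common*, 3> objects{&a, &b, &g};
    assert(call_by_index(objects, 1) == 2);
}
```

Every override must return the same type (or a covariant one), which is the restriction the proposal author objects to.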
>>>>
>>>>
>>>>
>>>> So I ask again, what is the problem (not the solution)?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 6:03 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> An extended reason why the class of problems described in the
>>>> previous emails exists is that the current constructs don't give enough
>>>> context to intermediate code generation, and rely solely on LLVM or other
>>>> backends being advanced enough. For a language that doesn't make sense,
>>>> because languages are meant to be less verbose and more explicit than a
>>>> compiler backend's code generation tools.
>>>>
>>>>
>>>>
>>>> (Really sorry for sending two emails at once.)
>>>>
>>>>
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:54 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> The actual class of problems is code repetition and obscured
>>>> intent:
>>>>
>>>> 1. Say you want to choose between three container objects, but they don't
>>>> have different types.
>>>>
>>>> 2. Say you want to choose between any three objects, but they have
>>>> different types.
>>>>
>>>>
>>>>
>>>> The current fix for both is to put those objects in a
>>>> std::vector/array with a std::variant element type, or to use switch statements.
>>>> All of these obscure intent, making it hard to optimize the code during intermediate
>>>> code generation. This is actually more common than we think; in fact, we can
>>>> destroy 99% of germs (branch statements) with function objects that can be
>>>> indexed.
>>>>
>>>>
>>>>
>>>> In short, the class of problems is using too many branch statements for
>>>> everything that can't be indexed by current containers.
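The contrast between branch statements and "function objects that can be indexed" can be sketched as a switch next to a table of function pointers. The operations add_one, twice and flip_sign and both apply_* helpers are hypothetical names chosen for this illustration; which form runs faster depends on the compiler and the program.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Three hypothetical operations that a switch statement would otherwise select.
inline int add_one(int x) { return x + 1; }
inline int twice(int x) { return x * 2; }
inline int flip_sign(int x) { return -x; }

// Branch version: one case label per alternative.
inline int apply_switch(std::size_t which, int x) {
    switch (which) {
        case 0: return add_one(x);
        case 1: return twice(x);
        case 2: return flip_sign(x);
        default: return x;  // an out-of-range index leaves x unchanged
    }
}

// Table version: the alternatives live in an array and are selected by index.
inline int apply_table(std::size_t which, int x) {
    using Op = int (*)(int);
    static constexpr std::array<Op, 3> table{add_one, twice, flip_sign};
    return which < table.size() ? table[which](x) : x;
}

// Both versions agree for every index, including an out-of-range one.
inline void table_demo() {
    for (std::size_t i = 0; i < 4; ++i) {
        assert(apply_switch(i, 5) == apply_table(i, 5));
    }
}
```

The table only works here because all three operations share one signature; alternatives with different return types are where the thread's difficulty begins.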
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:24 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> You mention “this class of problems” I would like to know what that
>>>> is? Please forget about a heterogenous list. What is the root problem
>>>> that the heterogenous list solves? Please describe an actual problem, not
>>>> a “it would be nice to solve.”
>>>>
>>>>
>>>>
>>>> To make this a little easier and to involve less people, here is my
>>>> email: weinrich.steve_at_[hidden]
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 5:17 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Thanks for your interest ❤️❤️❤️
>>>>
>>>> If you believe that any other solution may be a better option for this
>>>> class of problems, then please let me know. In fact, we could collaborate on
>>>> the proposal. This is not at odds with C++ because the (recommended)
>>>> semantics is a construct that captures values using const references, as
>>>> opposed to storing them directly as a struct would, and lets the compiler
>>>> access the value by reference, inline code for each branch, or do whatever it
>>>> thinks is best. It is similar to the heterogeneous lists provided by Go,
>>>> but not the same, because it captures by const lvalue reference, and references
>>>> aren't the same as non-const pointers, in that they can be optimized
>>>> in ways that const pointers can't be. Sorry for writing too much, I just got too
>>>> excited by someone openly taking an interest in this proposal ❤️❤️❤️❤️.
>>>>
>>>>
>>>>
>>>> Regards, Muneem
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:11 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> Thanks. I am interested in the original problem. I am curious if a
>>>> heterogeneous list is the optimal solution for the problem at hand. Over
>>>> the last 50+ years of programming, I have encountered many questions like
>>>> yours, which presume a particular solution. Sometimes that solution is at
>>>> odds with the language at hand (or other issues). I find that
>>>> understanding the original problem allows me to better understand why one
>>>> is suggesting a particular language enhancement.
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 5:03 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> >context for readers:second one was a class of problems fixed by
>>>> heterogeneous lists.
>>>>
>>>> It's the second one, and I am really really sorry for the confusion.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:58 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> If I may, a heterogeneous list is not a problem, it is a solution to a
>>>> problem. So there are now two possibilities:
>>>>
>>>> 1. You are trying to create a new std container to solve a class of
>>>> problems.
>>>> 2. You have some other problem that you have used a heterogenous
>>>> list to solve.
>>>>
>>>>
>>>>
>>>> What say you?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 4:54 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Sorry, I meant "heterogeneous lists is the problem". I am really, really
>>>> sorry for that mistake; it was an autocorrect mistake.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:52 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> Heterogeneous problems is the problem that leads to the code bloat or
>>>> verbosity that I described, but it's hard to choose whether the chicken came
>>>> first or the egg in this regard.
>>>>
>>>>
>>>>
>>>> Regards, Muneem
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:50 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> If you don’t mind, I would like to limit this portion of our
>>>> interaction to simply describing the problem. I want to get a 100%
>>>> understanding of the problem, unfettered by any assumptions (including C++
>>>> limitations) or previous solutions. Is a heterogenous list actually the
>>>> problem or is that simply a solution that you think fits the problem at
>>>> hand?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 4:40 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Hi!
>>>>
>>>> Thanks for your response!!!
>>>>
>>>>
>>>>
>>>> Yes, your understanding of the problem is mostly correct, with a few
>>>> details left out. The alternative to using classes would be switch
>>>> statements, which, as discussed, would lead to code bloat and might backfire.
>>>> The goal is to make a construct that makes indexing into heterogeneous
>>>> lists easier. There is, however, one thing that I would like to clarify:
>>>>
>>>> The goal isn't classic polymorphism but to be able to return an
>>>> object of any type through a specified interface (indexing heterogeneous
>>>> lists), which your example doesn't do. Basically, just as Go and many
>>>> other compiled languages support heterogeneous lists, I want C++ to do so as
>>>> well. Say you want to index a bunch of containers (to use a top()
>>>> function): std::visit fails because it can't return just any return type that you
>>>> would wish for, so you can either use switch case to write the top expression
>>>> for each container or use this long, verbose technique:
>>>>
>>>> #include<deque>
>>>>
>>>> #include<memory>
>>>>
>>>> #include<vector>
>>>>
>>>> template<typename T>
>>>>
>>>> struct Base{
>>>>
>>>>     virtual ~Base() = default;
>>>>
>>>>     virtual T top_wrapper() = 0;
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> template<typename T>
>>>>
>>>> struct Derived_1: Base<T>, std::vector<T>{
>>>>
>>>>     T top_wrapper() override{
>>>>
>>>>         //std::vector has no top(), so back() is used; T{} is just to show
>>>>         //that it works even if the call had some other return type
>>>>         return T{this->back()};
>>>>
>>>>     }
>>>>
>>>> };
>>>>
>>>> template<typename T>
>>>>
>>>> struct Derived_2: Base<T>, std::deque<T>{
>>>>
>>>>     T top_wrapper() override{
>>>>
>>>>         //std::deque has no top() either, so back() is used here as well
>>>>         return T{this->back()};
>>>>
>>>>     }
>>>>
>>>> };
>>>>
>>>> int main(){
>>>>
>>>>     //Pointers are stored because Base<int> by value would slice the derived objects
>>>>     std::vector<std::unique_ptr<Base<int>>> a;
>>>>
>>>>     //The compiler would probably optimize this example, but not an example
>>>>     //where you index the vector using real time input
>>>>
>>>>     return 0;
>>>>
>>>> }
>>>>
>>>> //A vector of std::variant only works if I am willing to write a
>>>> helper top(std::variant<Args...> obj) function that uses std::visit to
>>>> call top(); that in and of itself is not only verbose but obscures intent and
>>>> context.
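The std::variant helper described in that last comment could look roughly like the sketch below. It assumes back() as a stand-in for top(), since std::vector and std::deque do not provide top(); the alias AnyContainer and the function top_at are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <variant>
#include <vector>

// Generic helper: forwards to back() on whichever container is active.
// All alternatives must yield the same result type for std::visit to compile.
template <typename... Containers>
auto top(const std::variant<Containers...>& obj) {
    return std::visit([](const auto& c) { return c.back(); }, obj);
}

using AnyContainer = std::variant<std::vector<int>, std::deque<int>>;

// Indexes the heterogeneous list at run time, then dispatches on the type.
inline int top_at(const std::vector<AnyContainer>& list, std::size_t i) {
    return top(list.at(i));
}

// Small usage example.
inline void variant_top_demo() {
    std::vector<AnyContainer> list;
    list.emplace_back(std::vector<int>{1, 2});
    list.emplace_back(std::deque<int>{7, 8, 9});
    assert(top_at(list, 0) == 2);
    assert(top_at(list, 1) == 9);
}
```

The requirement that every alternative produce the same result type is the std::visit limitation the message refers to.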
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 2:43 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> I would like to make sure that I understand this problem before going
>>>> on.
>>>>
>>>>
>>>>
>>>> I think there are several classes that have a portion (or all) of their
>>>> interface in common. Each method of the interface returns a constant value:
>>>>
>>>>
>>>>
>>>> class First
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA () const { return 1123; }
>>>>
>>>>     int funcB () const { return 1234; }
>>>>
>>>> int funcC () const { return 1456; }
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> class Second
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA () const { return 2123; }
>>>>
>>>>     int funcB () const { return 2234; }
>>>>
>>>> int funcC () const { return 2456; }
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> class Third
>>>>
>>>> {
>>>>
>>>> public:
>>>>
>>>> int funcA () const { return 3123; }
>>>>
>>>>     int funcB () const { return 3234; }
>>>>
>>>> int funcC () const { return 3456; }
>>>>
>>>> };
>>>>
>>>>
>>>>
>>>> 1. We would like a means to be able to add more classes easily.
>>>> 2. We would like a means to be able to add to the shared interface
>>>> easily.
>>>> 3. We would like to be able to use the shared interface in a
>>>> polymorphic way (like a virtual method).
>>>> 4. Performance is of the utmost importance.
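One existing way to meet requirements 3 and 4 without virtual functions is a closed std::variant over the three classes. The sketch below reduces the classes to funcA() with the return values from the message; the alias AnyOf and the function call_funcA are hypothetical names, and whether this is faster than virtual dispatch has to be measured per program.

```cpp
#include <cassert>
#include <variant>

// The three classes from the message above, reduced to funcA().
class First  { public: int funcA() const { return 1123; } };
class Second { public: int funcA() const { return 2123; } };
class Third  { public: int funcA() const { return 3123; } };

// Closed set of alternatives; adding a class means extending this list.
using AnyOf = std::variant<First, Second, Third>;

// Calls funcA() on whichever alternative is currently stored.
inline int call_funcA(const AnyOf& object) {
    return std::visit([](const auto& o) { return o.funcA(); }, object);
}

// Small usage example.
inline void variant_demo() {
    const AnyOf object{Third{}};
    assert(call_funcA(object) == 3123);
}
```

Requirement 1 is the weak spot of this approach: every new class has to be added to the variant's type list.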
>>>>
>>>>
>>>>
>>>> Is my understanding correct?
>>>>
>>>>
>>>>
>>>> Steve
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 1:54 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Sorry for sending two emails at once!
>>>>
>>>> I just wanted to restate that the point of the whole proposal
>>>> is to provide intent. The code that Mr. Macieira was kind enough to bring
>>>> forward proves my exact point: with enough intent, the compiler can
>>>> optimize anything, and these optimizations grow larger as the scale of the
>>>> program grows larger. Microbenchmarks might show a single example, but even
>>>> that single example should get us thinking: why is it so slow for this
>>>> one example? Does this overhead force people to write switch case
>>>> statements that can lead to code bloat, which can again backfire in terms of
>>>> performance?
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 12:48 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> Hi!
>>>>
>>>> Thanks again for your feedback, Macieira. 👍
>>>>
>>>> >micro benchmark is misleading
>>>>
>>>> 1. The reason that I gave you microbenchmarks is that some asked for
>>>> them, and even I was reluctant to use them, despite Bjarne
>>>> Stroustrup's quote
>>>>
>>>> "Don't assume, measure", because in this case the goal is to either
>>>> make the compiler smaller or the runtime faster, both of which are targeted by
>>>> my new proposal.
>>>>
>>>> 2. You are right that the compiler might have folded the loop in
>>>> half, but the point is that it still shows that the observable behaviour
>>>> is the same. In fact, if the loop body were to index into a heterogeneous
>>>> set (using the proposed construct) and do some operation, then the compiler
>>>> would optimize the indexing if the index comes from a single source. This proves
>>>> that intent can help the compiler do wonders:
>>>>
>>>> 1. Fold loops even when I used volatile to avoid it.
>>>>
>>>> 2. Avoid the entire indexing operation (if in a loop, with the most
>>>> minimal compile time overhead).
>>>>
>>>> 3. Store the result into some memory location immediately after it takes
>>>> input (if that solution is the fastest).
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 3. Optimize a single expression for the sake of the whole program.
>>>>
>>>> Currently, the optimizer might in fact be able to optimize checks in a
>>>> loop, but it's not as easy or as guaranteed, because there are no semantic
>>>> promises that we can make with the existing constructs to make it happen.
>>>>
>>>> 4. My main point isn't whether my benchmark is correct or wrong, but
>>>> rather that expressing intent is better. The benchmark was merely to show
>>>> that std::visit is slower (according to programs compiled with g++ and
>>>> Microsoft Visual Studio 2026, measured using std::chrono and the Visual
>>>> Studio 2026 CPU usage measurement tools), but even if some compiler or all
>>>> compilers optimize its performance, we still have compile time overhead
>>>> for taking std::visit and making it faster, and the optimization might
>>>> backfire, since it would optimize single statements independently of
>>>> what's in the rest of the program. Why? Because, unlike my proposed
>>>> construct, std::visit does not have enough context and intent to tell the
>>>> compiler what's going on so that it can generate code that has the exact
>>>> "bookkeeping" data and access code that fits the entire program.
>>>>
>>>>
>>>>
>>>> 3. In case someone thinks a few nanoseconds in a single example
>>>> isn't a big deal, think again: if my construct is accepted, then
>>>> yes, it would not be a big deal, because the compiler could optimize many
>>>> indexing operations into a single heterogeneous set and maybe cache the
>>>> result somewhere afterwards. The issue is that this can't be done with the
>>>> current techniques because of the lack of intent. Compilers are much
>>>> smarter than we could ever be because they are the work of many people's entire
>>>> careers, not just one very smart guy from Intel, so blaming/restricting
>>>> compilers whose job is to be as general for the sake of the whole program.
>>>>
>>>> 4.>I suppose it decided to unroll the loop a bit
>>>>
>>>> >and made two calls to sink() per loop:
>>>>
>>>> >template <typename T> void sink(const T &) { asm volatile("" ::: "memory"); }
>>>>
>>>> Even if it optimized the switch case statement using asm volatile("" :::
>>>> "memory"), but not std::visit,
>>>>
>>>> my point isn't that switch case is magically faster, but rather that
>>>> the compiler has more room to cheat and skip things. In fact, the standard
>>>> allows it a lot of free room as long as the observable behaviour is the
>>>> same, even more so by giving it free room with sets of observable
>>>> behaviours (unspecified behaviours).
>>>>
>>>> 5. The microbenchmarking wasn't to show that std::visit is inherently
>>>> slower, but rather that the compiler can and should make mistakes in optimizing
>>>> it, in order to avoid massive compile time overhead.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, 3 Apr 2026, 8:33 pm Thiago Macieira via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> On Thursday, 2 April 2026 19:15:42 Pacific Daylight Time Thiago
>>>> Macieira via Std-Proposals wrote:
>>>> > Even in this case, I have profiled the code above (after fixing it and
>>>> > removing the std::cout itself) and found that overall, the switched case
>>>> > ran 2x faster, at 0.113 ns per iteration, while the variant case required
>>>> > 0.227 ns per iteration. Looking at the CPU performance counters, the
>>>> > std::variant code has 2 branches per iteration and takes 1 cycle per
>>>> > iteration, running at 5 IPC (thus, 5 instructions per iteration).
>>>> > Meanwhile, the switched case has 0.5 branch per iteration and takes 0.5
>>>> > cycle per iteration, running at 2 IPC. The half cycle numbers make sense
>>>> > because I believe the two instructions are getting macrofused together and
>>>> > execute as a single uop, which causes confusing numbers.
>>>>
>>>> This half a cycle and ninth of a nanosecond problem has been on my mind for a
>>>> while. The execution time of anything needs to be a multiple of the cycle
>>>> time, so a CPU running at 4.5 GHz like mine was shouldn't have a difference of
>>>> one ninth of a nanosecond. One explanation would be that somehow the CPU was
>>>> executing two iterations of the loop at the same time, pipelining.
>>>>
>>>> But disassembling the binary shows a simpler explanation. The switch loop was:
>>>>
>>>> 40149f: mov $0x3b9aca00,%eax
>>>> 4014a4: nop
>>>> 4014a5: data16 cs nopw 0x0(%rax,%rax,1)
>>>> 4014b0: sub $0x2,%eax
>>>> 4014b3: jne 4014b0
>>>>
>>>> [Note how there is no test for what was being indexed in the loop!]
>>>>
>>>> Here's what I had missed: sub $2. I'm not entirely certain what GCC was
>>>> thinking here, but it's subtracting 2 instead of 1, so this looped half a
>>>> billion times (0x3b9aca00 / 2). I suppose it decided to unroll the loop a bit
>>>> and made two calls to sink() per loop:
>>>>
>>>> template <typename T> void sink(const T &) { asm volatile("" ::: "memory"); }
>>>>
>>>> But that expanded to nothing in the output. I could add "nop" so we'd see what
>>>> happened and the CPU would be obligated to retire those instructions,
>>>> increasing the instruction executed counter (I can't quickly find how many the
>>>> TGL processor / WLC core can retire per cycle, but I recall it's 6, so adding
>>>> 2 more instructions shouldn't affect the execution time). But I don't think I
>>>> need to further benchmark this to prove my point:
>>>>
>>>> The microbenchmark is misleading.
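The arithmetic behind that conclusion can be checked with a few lines. The constants below are the figures quoted in the message; the function and constant names are hypothetical, and the tolerances in the checks are arbitrary choices for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Figures quoted in the message above.
constexpr long long kLoopCount = 0x3b9aca00;  // initial value loaded into %eax
constexpr double kClockGHz = 4.5;             // CPU frequency
constexpr double kSwitchNsPerIter = 0.113;    // measured, switch version
constexpr double kVariantNsPerIter = 0.227;   // measured, std::variant version

// Duration of one clock cycle in nanoseconds.
constexpr double cycle_ns() { return 1.0 / kClockGHz; }

// Number of trips around a loop that subtracts `step` until it reaches zero.
constexpr long long trips(long long count, long long step) { return count / step; }

// Cycles spent per source-level iteration for a given time per iteration.
constexpr double cycles_per_iteration(double ns_per_iter) {
    return ns_per_iter / cycle_ns();
}

static_assert(kLoopCount == 1'000'000'000, "0x3b9aca00 is one billion");
static_assert(trips(kLoopCount, 2) == 500'000'000, "sub $2 halves the trip count");

// The switch loop costs about half a cycle per counted iteration, which is
// consistent with one real cycle per trip when each trip counts as two.
inline void check_figures() {
    assert(std::abs(cycle_ns() - 1.0 / 4.5) < 1e-12);
    assert(std::abs(cycles_per_iteration(kSwitchNsPerIter) - 0.5) < 0.02);
    assert(std::abs(cycles_per_iteration(kVariantNsPerIter) - 1.0) < 0.05);
}
```

In other words, the apparent 2x speedup matches a loop that simply runs half as many trips, not a faster dispatch.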
>>>>
>>>> --
>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>> --
>>>> Std-Proposals mailing list
>>>> Std-Proposals_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
it does seem like I may be incapable of putting it into words. Like please
do a counter proposal, I would really appreciate it.
Regards, Muneem.
On Sat, 4 Apr 2026, 6:43 am Muneem, <itfllow123_at_[hidden]> wrote:
> I am really really sorry if my response was no satisfactory.
> In short, all I want is to solve the problem where we can't tell the
> compiler to "render an object of any type in place at the intermediate code
> level". This problem is a problem because we have to do it our selves
> unlike in languages like GO. One this problem is solved then heterogenous
> lists would be possible.
> Regards, Muneem
>
> On Sat, 4 Apr 2026, 6:24 am Steve Weinrich via Std-Proposals, <
> std-proposals_at_[hidden]> wrote:
>
>> Hi Maneem,
>>
>> I had hoped by asking you to explain the problem you are trying to solve,
>> I might be able to help you describe things in a way that would make it
>> easier for you to describe the exact language feature you are proposing.
>>
>> Either am not smart enough to understand you (a good probability) or you
>> are incapable of describing what you seek in C++ language terms. Keep in
>> mind that the committee only deals on the language!
>>
>> I wish you all the best.
>>
>> Cheers,
>> Steve
>>
>> On Fri, Apr 3, 2026, 19:09 Muneem via Std-Proposals <
>> std-proposals_at_[hidden]> wrote:
>>
>>> Thank you for your response!
>>> what is the problem(the question you asked)?
>>> There are many possible answers to what your question might have
>>> meant(hope any of these answer you):
>>>
>>> 1.This problem before c++ existed and before it had this static type
>>> system could be quantified as branching overhead and verbosity, and why we
>>> needed better techniques that avoided it.
>>> This wilkipedia article explains the first time that branching was a
>>> problem(all credits to wilkipedia:
>>> https://en.wikipedia.org/wiki/Branch_table):
>>> Use of branch tables and other raw data encoding was common in the early
>>> days of computing when memory was expensive, CPUs were slower and compact
>>> data representation and efficient choice of alternatives were important.
>>> This fact is truer now more than ever. In particular the issue in
>>> branching: pointer redirection using any implementation like vtables or
>>> function pointers. Sometimes, branching may infact be faster, but again,
>>> constructs that give no context and intent to the compiler are hard for the
>>> compiler to decide what to do with. This is why c++ came and provides if
>>> statements, switch statements, ternary statements, and many more so that it
>>> can provide the best intermediate code representation possible. Each type
>>> of branch statements isn't just a new syntax but it makes a user write a
>>> certain way, and let's the compiler do optimizations before the compiler
>>> backend reads the code.
>>>
>>> 2. The problem is the lack of a construct for describing a language
>>> level construct for type erasure that can result in optimized intermediate
>>> representation of the code.
>>>
>>> 3.The problem is virtual functions don't do it, switch case statements
>>> don't do it, nothing does, manual type erasure code dosent so it. The "it"
>>> is type erasure patterns. Switch case (and others) statements fails
>>> completely if you don't write code for each object again and again.
>>>
>>> 4. The problem is verbosity of current branching/polymorphism techniques
>>> for type erasure. Not only that but you can't even overload a polymorphic
>>> function to return a different type based on an argument(unless the return
>>> type is a polymorphic class(known as "Return Type Relaxation" in C++ 4th by
>>> Bajrne Stroustrup" section 20.3.6)) in order to fix this problem by the
>>> visitor pattern or some other double dispatch pattern.
>>>
>>> 5. The problem is lack of clear expression of type erasure.
>>>
>>>
>>> 2. I don't want to make a heterogeneous list in the traditional sense,
>>> but rather a list of const references to objects of any type, so it isn't
>>> against C++ philosophies. It just tries to automate manual type erasure
>>> and leave it to the compiler to produce an optimal intermediate code
>>> representation based on the specific program and on the context and
>>> source of every subscripting operation. That is as C++ as one can get. It
>>> would not be C++ if I just asked for a heterogeneous list that is
>>> completely up to the implementation; rather, I want type erasure for
>>> const lvalue references at the language level (optimized intermediate
>>> code representation). Think of it like templates, if that makes it easier
>>> to reason about. In fact, this is a form of "Return Type Relaxation"
>>> ("The C++ Programming Language", 4th edition, by Bjarne Stroustrup,
>>> section 20.3.6), but instead of pointers and references, I only want to
>>> use const references.
>>>
>>>
>>>
>>> On Sat, 4 Apr 2026, 5:22 am Steve Weinrich via Std-Proposals, <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> I am not trying to be difficult, but the “problems” that you have
>>>> described are {potentially} the result of implementation choices and/or C++
>>>> limitations. What I am currently interested in is the problem that was
>>>> presented before making those choices.
>>>>
>>>>
>>>>
>>>> Let me see if I can give an example. Someone says, I have a problem.
>>>> Every time I insert an object into std::vector, I have to re-sort the
>>>> vector. Obviously, they are using the wrong container. *But even if
>>>> they switch to a std::map, we don’t know if the data truly needs to be
>>>> sorted. Or how frequently it needs to be sorted.*
>>>>
>>>>
>>>>
>>>> You say, “You want to choose between three container objects but they
>>>> don't have different types.”
>>>>
>>>>
>>>>
>>>> That could look like:
>>>>
>>>>
>>>>
>>>>     std::vector<int> alpha;
>>>>     std::vector<int> beta;
>>>>     std::vector<int> gamma;
>>>>
>>>>
>>>>
>>>> This immediately raises the question, “Why three vectors?”
>>>>
>>>>
>>>>
>>>> You say, “You want to choose between any three objects but they have
>>>> different types.”
>>>>
>>>>
>>>>
>>>> That could look like:
>>>>
>>>>
>>>>
>>>> class Alpha
>>>> {
>>>> public:
>>>>     int funcA ();
>>>>
>>>> private:
>>>>     // Some Alpha specific data
>>>> };
>>>>
>>>> class Beta
>>>> {
>>>> public:
>>>>     int funcA ();
>>>>
>>>> private:
>>>>     // Some Beta specific data
>>>> };
>>>>
>>>> class Gamma
>>>> {
>>>> public:
>>>>     int funcA ();
>>>>
>>>> private:
>>>>     // Some Gamma specific data
>>>> };
>>>>
>>>>
>>>>
>>>> That immediately raises the question, “Will virtual functions work?”
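
To make the question concrete, here is a minimal sketch of what a virtual-function version could look like. The base class name Shared, the helper call_funcA and the returned values are assumptions made for illustration; the thread leaves the bodies of funcA unspecified.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical common interface; the name Shared is not from the thread.
struct Shared {
    virtual ~Shared() = default;
    virtual int funcA() const = 0;
};

// Placeholder return values, chosen only so the three types differ.
struct Alpha : Shared { int funcA() const override { return 1; } };
struct Beta  : Shared { int funcA() const override { return 2; } };
struct Gamma : Shared { int funcA() const override { return 3; } };

// Picks one object by run-time index and calls it through the interface.
// at() throws std::out_of_range if the index is invalid.
inline int call_funcA(const std::vector<std::unique_ptr<Shared>>& objs,
                      std::size_t i) {
    return objs.at(i)->funcA();
}
```

This works when every type returns the same type from funcA; it does not by itself cover the case where each object should yield a differently typed result.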
>>>>
>>>>
>>>>
>>>> So I ask again, what is the problem (not the solution)?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 6:03 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> A further reason why the class of problems described in the previous
>>>> emails exists is that the current constructs don't give enough context
>>>> to intermediate code generation and rely solely on LLVM or other
>>>> backends being advanced enough. That doesn't make sense for a language,
>>>> because languages are meant to be less verbose and more explicit than a
>>>> compiler backend's code generation tools.
>>>>
>>>>
>>>>
>>>> (Really sorry for sending two emails at once)
>>>>
>>>>
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:54 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> The actual class of problems is code repetition and obscured intent:
>>>>
>>>> 1. Say you want to choose between three container objects but they
>>>> don't have different types.
>>>>
>>>> 2. Say you want to choose between any three objects but they have
>>>> different types.
>>>>
>>>>
>>>>
>>>> The current fix for both is to store those objects in a
>>>> std::vector/array with a std::variant element type, or to use switch
>>>> statements. Both obscure intent, making it hard to optimize the code
>>>> during intermediate code generation. This is actually more common than
>>>> we think; in fact, we can destroy 99% of germs (branch statements) with
>>>> function objects that can be indexed.
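
One way to read "function objects that can be indexed" is a table of callables selected by a run-time index, in the spirit of a branch table. The sketch below is an assumption about what is meant, with hypothetical operations op_add, op_double and op_negate standing in for the bodies of three switch cases:

```cpp
#include <array>
#include <cstddef>

// Hypothetical operations standing in for the bodies of three switch cases.
inline int op_add(int x) { return x + 1; }
inline int op_double(int x) { return x * 2; }
inline int op_negate(int x) { return -x; }

// The table replaces `switch (i) { case 0: ... }` with one indexed call.
inline int dispatch(std::size_t i, int x) {
    static constexpr std::array<int (*)(int), 3> table{op_add, op_double,
                                                       op_negate};
    return table.at(i)(x);  // at() throws std::out_of_range on a bad index
}
```

This trades the conditional branches of a switch for one indirect call; whether that is faster depends on the compiler and the workload.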
>>>>
>>>>
>>>>
>>>> In short the class of problems is using too many branch statements for
>>>> everything that can't be indexed by current containers.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:24 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> You mention “this class of problems” I would like to know what that
>>>> is? Please forget about a heterogenous list. What is the root problem
>>>> that the heterogenous list solves? Please describe an actual problem, not
>>>> a “it would be nice to solve.”
>>>>
>>>>
>>>>
>>>> To make this a little easier and to involve less people, here is my
>>>> email: weinrich.steve_at_[hidden]
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 5:17 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Thanks for your interest ❤️❤️❤️
>>>>
>>>> If you believe that some other solution may be a better option for this
>>>> class of problems, then please let me know. In fact, we could
>>>> collaborate on the proposal. This is not at odds with C++ because the
>>>> (recommended) semantics are a construct that captures values by const
>>>> reference, as opposed to storing them directly as a struct would,
>>>> leaving the compiler free to access the value by reference, inline code
>>>> for each branch, or do whatever it thinks best. It is similar to the
>>>> heterogeneous lists provided by Go, but not the same, because it
>>>> captures by const reference, and references aren't the same as
>>>> non-const pointers, in that they can be optimized in ways that const
>>>> pointers can't be. Sorry for writing too much; I just got too excited by
>>>> someone openly taking an interest in this proposal ❤️❤️❤️❤️.
>>>>
>>>>
>>>>
>>>> Regards, Muneem
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 4:11 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> Thanks. I am interested in the original problem. I am curious if a
>>>> heterogeneous list is the optimal solution for the problem at hand. Over
>>>> the last 50+ years of programming, I have encountered many questions like
>>>> yours, which presume a particular solution. Sometimes that solution is at
>>>> odds with the language at hand (or other issues). I find that
>>>> understanding the original problem allows me to better understand why one
>>>> is suggesting a particular language enhancement.
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 5:03 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> >context for readers:second one was a class of problems fixed by
>>>> heterogeneous lists.
>>>>
>>>> It's the second one, and I am really really sorry for the confusion.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:58 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> If I may, a heterogeneous list is not a problem, it is a solution to a
>>>> problem. So there are now two possibilities:
>>>>
>>>> 1. You are trying to create a new std container to solve a class of
>>>> problems.
>>>> 2. You have some other problem that you have used a heterogenous
>>>> list to solve.
>>>>
>>>>
>>>>
>>>> What say you?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 4:54 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Sorry, I meant "heterogeneous lists is the problem"; really, really
>>>> sorry for that mistake. It was an autocorrect mistake.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:52 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> Heterogeneous problems is the problem that leads to the code bloat or
>>>> verbosity that I described, but it's hard to say whether the chicken or
>>>> the egg came first in this regard.
>>>>
>>>>
>>>>
>>>> Regards, Muneem
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 3:50 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> If you don’t mind, I would like to limit this portion of our
>>>> interaction to simply describing the problem. I want to get a 100%
>>>> understanding of the problem, unfettered by any assumptions (including C++
>>>> limitations) or previous solutions. Is a heterogenous list actually the
>>>> problem or is that simply a solution that you think fits the problem at
>>>> hand?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> Steve
>>>>
>>>>
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 4:40 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Hi!
>>>>
>>>> Thanks for your response!!!
>>>>
>>>>
>>>>
>>>> Yes, your understanding of the problem is mostly correct, with a few
>>>> details left out. The alternative to using classes would be switch
>>>> statements, which, as discussed, would lead to code bloat and might
>>>> backfire. The goal is a construct that makes indexing into heterogeneous
>>>> lists easier. There is, however, one thing that I would like to clarify:
>>>>
>>>> The goal isn't classic polymorphism but to be able to return an object
>>>> of any type through a specified interface (indexing heterogeneous
>>>> lists), which your example doesn't do. Basically, just as Go and many
>>>> other compiled languages support heterogeneous lists, I want C++ to do
>>>> so as well. Say you want to index a bunch of containers (to call a top()
>>>> function): std::visit fails because it can't return whatever return type
>>>> you might wish for, so you can either use switch case to write the top()
>>>> expression for each container or use this long, verbose technique:
>>>>
>>>> #include <deque>
>>>> #include <memory>
>>>> #include <vector>
>>>>
>>>> template <typename T>
>>>> struct Base {
>>>>     virtual ~Base() = default;
>>>>     virtual T top_wrapper() const = 0;
>>>> };
>>>>
>>>> template <typename T>
>>>> struct Derived_1 : Base<T>, std::vector<T> {
>>>>     T top_wrapper() const override {
>>>>         // std::vector has no top(), so back() stands in for it here.
>>>>         // T{} is just to show that it works even if top had some other
>>>>         // return type.
>>>>         return T{this->back()};
>>>>     }
>>>> };
>>>>
>>>> template <typename T>
>>>> struct Derived_2 : Base<T>, std::deque<T> {
>>>>     T top_wrapper() const override {
>>>>         // Same as above: back() stands in for top().
>>>>         return T{this->back()};
>>>>     }
>>>> };
>>>>
>>>> int main() {
>>>>     // Pointers are stored because a std::vector<Base<int>> would slice
>>>>     // the derived objects.
>>>>     std::vector<std::unique_ptr<Base<int>>> a;
>>>>     // The compiler would probably optimize this example, but not an
>>>>     // example where you index the vector using real-time input.
>>>>     return 0;
>>>> }
>>>>
>>>> // A vector of std::variant only works if I am willing to write a
>>>> // helper top(std::variant<Args...> obj) function that uses std::visit to
>>>> // call top(), which in and of itself is not only verbose but also
>>>> // obscures intent and context.
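
The helper described above could be sketched as follows. The alias IntContainer is an assumption, and because std::vector and std::deque have no top() member, back() stands in for it here:

```cpp
#include <deque>
#include <variant>
#include <vector>

// Hypothetical alias: the set of container types we want to choose between.
using IntContainer = std::variant<std::vector<int>, std::deque<int>>;

// Hypothetical helper: returns the last element of whichever container is
// currently held. Precondition: the held container is non-empty.
inline int top(const IntContainer& c) {
    return std::visit([](const auto& held) { return held.back(); }, c);
}
```

A std::vector<IntContainer> can then be indexed at run time and passed to top(). Note that every alternative must produce the same return type from the visitor, which is the limitation the message complains about.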
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 2:43 am Steve Weinrich via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> Hi Muneem,
>>>>
>>>>
>>>>
>>>> I would like to make sure that I understand this problem before going
>>>> on.
>>>>
>>>>
>>>>
>>>> I think there are several classes that have a portion (or all) of their
>>>> interface in common. Each method of the interface returns a constant value:
>>>>
>>>>
>>>>
>>>> class First
>>>> {
>>>> public:
>>>>     int funcA () const { return 1123; }
>>>>     int funcB () const { return 1234; }
>>>>     int funcC () const { return 1456; }
>>>> };
>>>>
>>>> class Second
>>>> {
>>>> public:
>>>>     int funcA () const { return 2123; }
>>>>     int funcB () const { return 2234; }
>>>>     int funcC () const { return 2456; }
>>>> };
>>>>
>>>> class Third
>>>> {
>>>> public:
>>>>     int funcA () const { return 3123; }
>>>>     int funcB () const { return 3234; }
>>>>     int funcC () const { return 3456; }
>>>> };
>>>>
>>>>
>>>>
>>>> 1. We would like a means to be able to add more classes easily.
>>>> 2. We would like a means to be able to add to the shared interface
>>>> easily.
>>>> 3. We would like to be able to use the shared interface in a
>>>> polymorphic way (like a virtual method).
>>>> 4. Performance is of the utmost importance.
>>>>
>>>>
>>>>
>>>> Is my understanding correct?
>>>>
>>>>
>>>>
>>>> Steve
>>>>
>>>> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
>>>> Behalf Of *Muneem via Std-Proposals
>>>> *Sent:* Friday, April 3, 2026 1:54 PM
>>>> *To:* std-proposals_at_[hidden]
>>>> *Cc:* Muneem <itfllow123_at_[hidden]>
>>>> *Subject:* Re: [std-proposals] Fwd: Extension to runtime polymorphism
>>>> proposed
>>>>
>>>>
>>>>
>>>> Sorry for sending two emails at once!
>>>>
>>>> I just wanted to reiterate that the point of the whole proposal is to
>>>> provide intent. The code that Mr. Macieira was kind enough to bring
>>>> forward proves my exact point: with enough intent, the compiler can
>>>> optimize anything, and these optimizations grow larger as the scale of
>>>> the program grows larger. Microbenchmarks might show a single example,
>>>> but even that single example should get us thinking: why is it so slow
>>>> for this one example? Does this overhead force people to write
>>>> switch-case statements that can lead to code bloat, which can again
>>>> backfire in terms of performance?
>>>>
>>>> Regards, Muneem.
>>>>
>>>>
>>>>
>>>> On Sat, 4 Apr 2026, 12:48 am Muneem, <itfllow123_at_[hidden]> wrote:
>>>>
>>>> Hi!
>>>>
>>>> Thanks again for your feedback, Macieira. 👍
>>>>
>>>> >micro benchmark is misleading
>>>>
>>>> 1. The reason that I gave you microbenchmarks is that some people asked
>>>> for them, and even I was reluctant to use them despite Bjarne
>>>> Stroustrup's quote
>>>>
>>>> "Don't assume, measure", because in this case the goal is to either
>>>> make the compiler smaller or the runtime faster, both of which are
>>>> targeted by my new proposal.
>>>>
>>>> 2. You are right that the compiler might have folded the loop in half,
>>>> but the point is that it still shows that the observable behaviour is
>>>> the same. In fact, if the loop body were to index into a heterogeneous
>>>> set (using the proposed construct) and do some operation, then the
>>>> compiler would optimize the indexing if the index has a single source.
>>>> This proves that intent can help the compiler do wonders:
>>>>
>>>> 1. Fold loops even when I used volatile to avoid it.
>>>>
>>>> 2. Avoid the entire indexing operation (if in a loop, with the most
>>>> minimal compile-time overhead).
>>>>
>>>> 3. Store the result immediately after it takes input into some memory
>>>> location (if that solution is the fastest).
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 3. Optimize a single expression for the sake of the whole program.
>>>>
>>>> Currently, the optimizer might in fact be able to optimize checks in a
>>>> loop, but it's not as easy or as guaranteed, because there are no
>>>> semantic promises that we can make with the existing constructs to make
>>>> it happen.
>>>>
>>>> 4. My main point isn't whether my benchmark is correct or wrong, but
>>>> rather that expressing intent is better. The benchmark was merely to
>>>> show that std::visit is slower (according to programs compiled with g++
>>>> and Microsoft Visual Studio 2026, measured with std::chrono and the
>>>> Visual Studio 2026 CPU usage tools), but even if some or all compilers
>>>> optimize its performance, we still pay compile-time overhead for making
>>>> std::visit faster, and the optimization might backfire since it would
>>>> optimize single statements independently of what's in the rest of the
>>>> program. Why? Because unlike my proposed construct, std::visit does not
>>>> carry enough context and intent to tell the compiler what's going on so
>>>> that it can generate code that has the exact "book keeping" data and
>>>> access code that fits the entire program.
>>>>
>>>>
>>>>
>>>> 3. In case someone thinks a few nanoseconds in a single example isn't a
>>>> big deal, think again: if my construct is accepted, then yes, it would
>>>> not be a big deal, because the compiler can optimize many indexing
>>>> operations into a single heterogeneous set and maybe cache the result
>>>> somewhere afterwards. The issue is that this can't be done with the
>>>> current techniques because of the lack of intent. Compilers are much
>>>> smarter than we could ever be because they are the work of many people's
>>>> entire careers, not just one very smart guy from Intel, so there is
>>>> little point in blaming or restricting compilers, whose job is to be as
>>>> general as possible for the sake of the whole program.
>>>> 4. > I suppose it decided to unroll the loop a bit
>>>>
>>>> > and made two calls to sink() per loop:
>>>>
>>>> > template <typename T> void sink(const T &) { asm volatile("" :::
>>>> > "memory"); }
>>>>
>>>> Even if it optimized the switch-case statement that used asm
>>>> volatile("" ::: "memory"); but not std::visit,
>>>>
>>>> my point isn't that switch case is magically faster, but rather that
>>>> the compiler has more room to cheat and skip things. In fact, the
>>>> standard allows it a lot of freedom as long as the observable behaviour
>>>> is the same, even more so by allowing sets of observable behaviours
>>>> (unspecified behaviour).
>>>>
>>>> 5. The microbenchmarking wasn't meant to show that std::visit is
>>>> inherently slower, but rather that the compiler can and should make
>>>> mistakes in optimizing it, in order to avoid massive compile-time
>>>> overhead.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, 3 Apr 2026, 8:33 pm Thiago Macieira via Std-Proposals, <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>> On Thursday, 2 April 2026 19:15:42 Pacific Daylight Time Thiago
>>>> Macieira via Std-Proposals wrote:
>>>> > Even in this case, I have profiled the code above (after fixing it
>>>> > and removing the std::cout itself) and found that overall, the
>>>> > switched case ran 2x faster, at 0.113 ns per iteration, while the
>>>> > variant case required 0.227 ns per iteration. Looking at the CPU
>>>> > performance counters, the std::variant code has 2 branches per
>>>> > iteration and takes 1 cycle per iteration, running at 5 IPC (thus, 5
>>>> > instructions per iteration). Meanwhile, the switched case has 0.5
>>>> > branch per iteration and takes 0.5 cycle per iteration, running at 2
>>>> > IPC. The half cycle numbers make sense because I believe the two
>>>> > instructions are getting macrofused together and execute as a single
>>>> > uop, which causes confusing numbers.
>>>>
>>>> This half a cycle and ninth of a nanosecond problem has been on my mind
>>>> for a while. The execution time of anything needs to be a multiple of
>>>> the cycle time, so a CPU running at 4.5 GHz, like mine was, shouldn't
>>>> have a difference of one ninth of a nanosecond. One explanation would
>>>> be that somehow the CPU was executing two iterations of the loop at the
>>>> same time, pipelining.
>>>>
>>>> But disassembling the binary shows a simpler explanation. The switch
>>>> loop was:
>>>>
>>>> 40149f: mov $0x3b9aca00,%eax
>>>> 4014a4: nop
>>>> 4014a5: data16 cs nopw 0x0(%rax,%rax,1)
>>>> 4014b0: sub $0x2,%eax
>>>> 4014b3: jne 4014b0
>>>>
>>>> [Note how there is no test for what was being indexed in the loop!]
>>>>
>>>> Here's what I had missed: sub $2. I'm not entirely certain what GCC was
>>>> thinking here, but it's subtracting 2 instead of 1, so this looped half
>>>> a billion times (0x3b9aca00 / 2). I suppose it decided to unroll the
>>>> loop a bit and made two calls to sink() per loop:
>>>>
>>>> template <typename T> void sink(const T &) { asm volatile("" ::: "memory"); }
>>>>
>>>> But that expanded to nothing in the output. I could add "nop" so we'd
>>>> see what happened and the CPU would be obligated to retire those
>>>> instructions, increasing the instruction executed counter (I can't
>>>> quickly find how many the TGL processor / WLC core can retire per cycle,
>>>> but I recall it's 6, so adding 2 more instructions shouldn't affect the
>>>> execution time). But I don't think I need to further benchmark this to
>>>> prove my point:
>>>>
>>>> The microbenchmark is misleading.
>>>>
>>>> --
>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>> --
>>>> Std-Proposals mailing list
>>>> Std-Proposals_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>
>
Received on 2026-04-04 01:47:36
