Date: Fri, 3 Apr 2026 04:53:30 +0500
One last email (please forgive me for sending two in a row, but I think
I misunderstood your question and wanted to answer it properly).
I hope this is what you asked for:
// Current C++:
#include <variant>
#include <iostream>
#include <string>
struct A { int get() const { return 1; } };
struct B { std::string get() const { return "love C++"; } };
struct C { double get() const { return 2.1; } };
struct data_accessed_through_visit {
    // inline (C++17) so that the static member is also defined here
    inline static std::variant<A, B, C> obj;
    inline void operator()(int) {
        // dispatch on whichever alternative obj currently holds
        std::visit([](auto&& arg) {
            std::cout << arg.get();
        }, obj);
    }
};
// In my view this shows that std::visit is too verbose, and in my
// benchmarks too slow (too hard or impractical to optimize), for this task.
What about switches? See for yourself how it looks (and imagine it
repeated many times over hundreds of case statements):
struct data_switched {
    inline void operator()(int index) {
        switch (index) {
            case 0: std::cout << 1; break;
            case 1: std::cout << "love C++"; break;
            case 2: std::cout << 2; break;
            default: std::cout << -1; break; // out-of-range fallback
        }
    }
};
Proposed fix (hypothetical syntax, does not compile today):
struct data_indexed {
    { 1, "love C++", 2 } return_indexed_value(std::size_t index);
    // A final value could be added to specify what is returned when the
    // index is out of range; that would be great because the compiler could
    // even optimize that check.
    inline void operator()(int index) {
        std::cout << return_indexed_value(index);
    }
};
Sorry for sending two emails in a row; I just had to clarify myself one
last time.
On Fri, Apr 3, 2026 at 4:09 AM Muneem <itfllow123_at_[hidden]> wrote:
> I am really sorry for sending a flood of emails. As for the
> example, it's at the end of this proposal file; please feel free to change
> it directly or by contributing to it on my GitHub. The reason I
> thought sending a lot of emails was a good idea is that we have short
> attention spans, so I wanted to answer your questions right away while you
> still had them in mind. If you want, we can collaborate on my GitHub
> on this feature or on yours (I am comfortable either way and at any time
> of the day (except when I am sleeping)).
> The link and the file is:
>
> https://github.com/HjaldrKhilji/Future-potential-ISO-porposals/blob/main/Extension%20to%20runtime%20polymorphism.txt
> The example clearly shows that, just like the spaceship operator, showing
> intent is always better and valuable. I have yet to define the detailed
> features of this new syntax (i.e., const and volatile qualifications, or even
> potential thread policies for a single access) because I don't
> know what the consensus is. Technically, this can be solved a billion
> ways, but right now the solution is minimal because I haven't convinced
> anyone to propose a solution themselves.
> Again, sorry for sending too many emails; I thought that you guys use AI
> to summarize all my points into short, concise ones before reading my raw
> emails. I use this technique when talking to foreigners who use confusing
> proverbs. Again, I should've known better, so I am really sorry.
>
>
> On Fri, 3 Apr 2026, 3:52 am Jens Maurer, <jens.maurer_at_[hidden]> wrote:
>
>> Hi!
>>
>> May I suggest that you stop your habit of sending multiple
>> rapid-fire responses to the same e-mail?
>>
>> Take a breath, take a good night's sleep, and take another
>> breath before replying to an e-mail. Then, compose that
>> e-mail as a draft, take another night, and re-read it the
>> next day. Remember, there are orders of magnitude more
>> people you want to read your e-mail than there are who
>> writes it (just you, I'd guess), so you should make it
>> as easy as possible for those people to read your
>> utterances.
>>
>> Despite the flurry of e-mails, I have not yet seen a
>> simple example showing a use of the facility you wish to
>> have (it doesn't have to compile with current compilers),
>> and a comparison with the (most obvious) code I would need
>> to write without the facility you wish to have.
>>
>> Jens
>>
>>
>> On 4/2/26 21:20, Muneem via Std-Proposals wrote:
>> > std::array<int, 3> array_1 = { 1,2,3 }; is also global variable, but
>> the ternary/subcript operator statements dont have a hard time figuring it
>> out.
>> > regards, Muneem.
>> >
>> >
>> > On Fri, Apr 3, 2026 at 12:18 AM Muneem <itfllow123_at_[hidden] <mailto:
>> itfllow123_at_[hidden]>> wrote:
>> >
>> > >why is std::variant object static
>> > I made the std::variant static because the array that I was getting
>> was also a global variable, infact even when I make the class objects
>> return an array variable instead of a constant in that std::variant, the
>> rise in latency is almost 0 because the latency of std::visit itself is
>> around 470 ns. Again, context and intent are the golden words for any
>> compiler.
>> > regards, Muneem
>> >
>> > On Fri, Apr 3, 2026 at 12:14 AM Muneem <itfllow123_at_[hidden]
>> <mailto:itfllow123_at_[hidden]>> wrote:
>> >
>> > So, I am sorry for confusing you, but the think is that
>> currently, you need std::visit or branching to index homogeneous values
>> with a runtime index, this means that the context and intent given to the
>> compiler is minimal, where as for something like template inlining, the
>> compiler has a lot of context and intent, from compiler optimization flags
>> to instantiation time (which can be delayed till link time for the linker,
>> which can lead to even better optimizations), to the context of defining
>> the template definition, to the context of argument dependent lookup even,
>> which for templates is even more flexible since it can loop for it in any
>> namespace to find the most perfect match, which again gives the compiler
>> more information. The goal is to give compiler context and intent so that
>> it can find the best time to optimize (any phase in compilation or
>> linkage), and it knows as much as possible on how to optimize. here are the
>> bench marks in visual studio 2026
>> > Time (ns) for switch: 97
>> > Time (ns) for visit: 534
>> > Time (ns) for ternary: 41
>> > Time (ns) for subscript: 61
>> >
>> > On Thu, Apr 2, 2026 at 3:02 PM Marcin Jaczewski <
>> marcinjaczewski86_at_[hidden] <mailto:marcinjaczewski86_at_[hidden]>> wrote:
>> >
>> > czw., 2 kwi 2026 o 11:45 Muneem <itfllow123_at_[hidden]
>> <mailto:itfllow123_at_[hidden]>> napisał(a):
>> > >
>> > > That's the point, std variant isn't usable here, nor is
>> any other polymorphic technique, hence we need the features that I
>> described in the standard. The results consistently show that subscription
>> and ternary statements can be easier to optimize than the others, which
>> show the need for a new construct to deal with homegenous sets of values.
>> Current branching techniques work, but can lead to verbose code, is adhoc,
>> and is just not efficient when compared to subcripting; thats why we need
>> subscripting like capabilities for homegenous sets of values. Context is
>> the most important for a compiler because it can reorder expressions in
>> that context in many ways (as long as the observable behaviour is the same,
>> refer to the standard for a defintion of "observable behaviour" if you
>> would like).
>> > > Std::variant and polymorphism fail at fixing everything,
>> and I think that we can fix this specific issue of homogenous lists with
>> this new feature. The goal is to give the compiler context and intent.
>> > > Regards, Muneem.
>> > >
>> >
>> > You repeat only buzzwords here. I ask for real life
>> examples of code
>> > that have this problem and how your "solution" fix it
>> exactly.
>> > It looks like an XY problem. Like, you have a contradiction
>> there:
>> > "polymorphic" and "homegenous".
>> > Why do you try to use techine that is for different kinds
>> of problems
>> > and claim it does not work?
>> > Did you try to use proper tools for problems you have?
>> >
>> > > On Thu, 2 Apr 2026, 1:11 pm Marcin Jaczewski, <
>> marcinjaczewski86_at_[hidden] <mailto:marcinjaczewski86_at_[hidden]>> wrote:
>> > >>
>> > >> czw., 2 kwi 2026 o 09:05 Muneem via Std-Proposals
>> > >> <std-proposals_at_[hidden] <mailto:
>> std-proposals_at_[hidden]>> napisał(a):
>> > >> >
>> > >> > hi!
>> > >> > Your point is partly correct, but this issue is quite
>> prevalent, below are my branches from multiple sources:
>> > >> > this is the updated code:
>> > >> > #include <variant>
>> > >> > #include <iostream>
>> > >> > #include <chrono>
>> > >> > #include <ctime>
>> > >> > #include <iomanip>
>> > >> > #include<array>
>> > >> > std::array<int, 3> array_1={1,2,3};
>> > >> >
>> > >> > struct A { int get() { return array_1[0]; } };
>> > >> > struct B { int get() { return array_1[1]; } };
>> > >> > struct C { int get() { return array_1[2]; } };
>> > >> >
>> > >> > struct data_accessed_through_visit {
>> > >> > static std::variant<A, B, C> obj;
>> > >> >
>> > >> > inline int operator()(int) {
>> > >> > return std::visit([](auto&& arg) {
>> > >> > return arg.get();
>> > >> > }, obj);
>> > >> > }
>> > >> > };
>> > >> > std::variant<A, B, C>
>> data_accessed_through_visit::obj=C{};
>> > >> > int user_index = 0;
>> > >>
>> > >> I can hold you right now, you use a static object for the
>> > >> `std::variant`, this means you test different thing than
>> in other
>> > >> cases.
>> > >> This can affect code gen and what optimizer can see and
>> guarantee.
>> > >>
>> > >> Besides, why use `variant` here? It was not designed for
>> this but to
>> > >> store different types in the same memory location.
>> > >> And none of this is used there, only making it harder to
>> compiler to
>> > >> see what is going on.
>> > >>
>> > >> Why not use simple pointers there? And if you need to do
>> some
>> > >> calculations function pointers?
>> > >>
>> > >> >
>> > >> > struct data_ternary {
>> > >> > inline int operator()(int index) {
>> > >> > return (index == 0) ? array_1[0] : (index ==
>> 1) ? array_1[1] : (index == 2) ? array_1[2] : -1;
>> > >> > }
>> > >> > };
>> > >> >
>> > >> > struct data_switched {
>> > >> > inline int operator()(int index) {
>> > >> > switch(index) {
>> > >> > case 0: return array_1[0];
>> > >> > case 1: return array_1[1];
>> > >> > case 2: return array_1[2];
>> > >> > default: return -1;
>> > >> > }
>> > >> > }
>> > >> > };
>> > >> >
>> > >> > struct data_indexing {
>> > >> > inline int operator()(int index) {
>> > >> > return array_1[index];
>> > >> > }
>> > >> > };
>> > >> >
>> > >> >
>> > >> >
>> > >> > volatile int x = 0;
>> > >> > constexpr uint64_t loop_count=10000;
>> > >> > static void measure_switch() {
>> > >> > data_switched obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> >
>> > >> > static void measure_visit() {
>> > >> > data_accessed_through_visit obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >>
>> > >> Why not use:
>> > >> ```
>> > >> static void measure_visit() {
>> > >> data_accessed_through_visit obj
>> > >> return std::visit([](auto&& arg) {
>> > >> for (int i=0; i++<loop_count;) {
>> > >> x = arg.get();
>> > >> }
>> > >> }, obj);
>> > >> }
>> > >> ```
>> > >> And this is more similar to what the compiler sees in
>> other test cases.
>> > >>
>> > >> >
>> > >> > static void measure_ternary() {
>> > >> > data_ternary obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> > static void measure_indexing() {
>> > >> > data_indexing obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> >
>> > >> > template<typename func_t>
>> > >> > void call_func(func_t callable_obj, int arg){
>> > >> > const auto start =
>> std::chrono::steady_clock::now();
>> > >> >
>> > >> > constexpr int how_much_to_loop=1000;
>> > >> > for(int i=0; i++<how_much_to_loop;){
>> > >> > callable_obj();
>> > >> > }
>> > >> > const auto end = std::chrono::steady_clock::now();
>> > >> > auto result=
>> std::chrono::duration_cast<std::chrono::nanoseconds>(end -
>> start).count()/how_much_to_loop;
>> > >> > std::cout<<result/how_much_to_loop<<std::endl;
>> > >> >
>> > >> > }
>> > >> >
>> > >> > int main() {
>> > >> > std::cout << "Enter index (0 for A, 1 for B, 2 for
>> C): ";
>> > >> > if (!(std::cin >> user_index)) return 1;
>> > >> >
>> > >> > // Set the variant state
>> > >> > if (user_index == 0)
>> data_accessed_through_visit::obj = A{};
>> > >> > else if (user_index == 1)
>> data_accessed_through_visit::obj = B{};
>> > >> > else if (user_index == 2)
>> data_accessed_through_visit::obj = C{};
>> > >> >
>> > >> > std::cout << "Time (ns) for switch: ";
>> > >> > call_func(measure_switch, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for visit: ";
>> > >> > call_func(measure_visit, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for ternary: ";
>> > >> > call_func(measure_ternary, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for subscript: ";
>> > >> > call_func(measure_indexing, user_index);
>> > >> >
>> > >> > return 0;
>> > >> > }
>> > >> > the bench marks consistently show that these syntax
>> constructs do matter (the smaller the index range is, the more the compiler
>> can flatten it and know how to branch), notice how ternary is outperforming
>> them all even though its nesting, This means that adding new syntax with
>> the sole purpose to give compilers as much information as possible is
>> actually useful.
>> > >>
>> > >> And what do you propose here exactly? diffrent
>> structures behave differently.
>> > >> How exactly it will look and work, right now you use lot
>> of vague
>> > >> statement that do not show anything.
>> > >>
>> > >> > Consider how templates and instantiation give the
>> compiler extra insight. why? because templates are instantiated at the
>> point of instantiation which can be delayed upto link time. these are the
>> benchmarks:
>> > >>
>> > >> what? did you confuse it with Link Time Optimization?
>> instantiation is
>> > >> in the Transition Unit not at link time as this would be
>> too late.
>> > >> We have Extern Templates but then compiler in TU becomes
>> blind to that
>> > >> code as it instantiationed in other TU, LTO can help
>> there but
>> > >> then we have exactly the same case if we use normal
>> functions.
>> > >>
>> > >> > benchmarks for g++:
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 33
>> > >> > Time (ns) for visit: 278
>> > >> > Time (ns) for ternary: 19
>> > >> > Time (ns) for subscript: 34
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 33
>> > >> > Time (ns) for visit: 296
>> > >> > Time (ns) for ternary: 20
>> > >> > Time (ns) for subscript: 35
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 271
>> > >> > Time (ns) for ternary: 17
>> > >> > Time (ns) for subscript: 33
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 281
>> > >> > Time (ns) for ternary: 19
>> > >> > Time (ns) for subscript: 32
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 282
>> > >> > Time (ns) for ternary: 20
>> > >> > Time (ns) for subscript: 34
>> > >> > I really have to go to sleep now ( I am having some
>> issues with visual studio 2026), I Hope, it would be acceptable for me to
>> send the benchmarks for that tomorrow.
>> > >> >
>> > >> > regards, Muneem
>> > >> >
>> > >> >
>> > >> > On Thu, Apr 2, 2026 at 10:54 AM Thiago Macieira via
>> Std-Proposals <std-proposals_at_[hidden] <mailto:
>> std-proposals_at_[hidden]>> wrote:
>> > >> >>
>> > >> >> On Wednesday, 1 April 2026 21:56:44 Pacific Daylight
>> Time Muneem via Std-
>> > >> >> Proposals wrote:
>> > >> >> > /*
>> > >> >> > Time (ns) for switch: 168100
>> > >> >> > Time (ns) for visit: 3664100
>> > >> >> > Time (ns) for ternary: 190900
>> > >> >> > It keeps on getting worse!
>> > >> >> > */
>> > >> >>
>> > >> >> So far you've maybe shown that one implementation is
>> generating bad code. Have
>> > >> >> you tried others?
>> > >> >>
>> > >> >> You need to prove that this is an inherent and
>> unavoidable problem of the
>> > >> >> requirements, not that it just happened to be bad for
>> this implementation.
>> > >> >> Just quickly reading the proposed benchmark code, it
>> would seem there's no
>> > >> >> such inherent reason and you're making an unfounded
>> and probably incorrect
>> > >> >> assumption about how things actually work.
>> > >> >>
>> > >> >> In fact, I pasted a portion of your code into godbolt
>> just to see what the
>> > >> >> variant visit code, which you claim to be
>> unnecessarily slow, would look like:
>> > >> >> https://gcc.godbolt.org/z/WK5bMzcae <
>> https://gcc.godbolt.org/z/WK5bMzcae>
>> > >> >>
>> > >> >> The first thing to note in the GCC/libstdc++ pane is
>> that it does not use
>> > >> >> user_index. The compiler thinks it's a constant,
>> meaning this benchmark is
>> > >> >> faulty. And thus it has constant-propagated this
>> value and is *incredibly*
>> > >> >> efficient in doing nothing useful. MSVC did likewise.
>> > >> >>
>> > >> >> Since MSVC outputs the out-of-line copy of inlined
>> functions, we can see the
>> > >> >> operator() expansion without the proapagation of the
>> user_index constant. And
>> > >> >> it's no different than what a ternary or switch would
>> look like.
>> > >> >>
>> > >> >> In the Clang/libc++ pane, we see indirect function
>> calls. I don't know why
>> > >> >> libc++ std::variant is implemented this way, but it
>> could be why it is slow
>> > >> >> for you if you're using this implementation. If you
>> tell Clang to instead use
>> > >> >> libstdc++ (remove the last argument of the
>> command-line), the indirect
>> > >> >> function call disappears and we see an unrolled loop
>> of loading the value 10.
>> > >> >> That would mean Clang is even more efficient at doing
>> nothing.
>> > >> >>
>> > >> >> Conclusion: it looks like your assumption that there
>> is a problem to be solved
>> > >> >> is faulty. There is no problem.
>> > >> >>
>> > >> >> --
>> > >> >> Thiago Macieira - thiago (AT) macieira.info <
>> http://macieira.info> - thiago (AT) kde.org <http://kde.org>
>> > >> >> Principal Engineer - Intel Data Center - Platform &
>> Sys. Eng.
>> > >> >> --
>> > >> >> Std-Proposals mailing list
>> > >> >> Std-Proposals_at_[hidden] <mailto:
>> Std-Proposals_at_[hidden]>
>> > >> >>
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals <
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals>
>> > >> >
>> > >> > --
>> > >> > Std-Proposals mailing list
>> > >> > Std-Proposals_at_[hidden] <mailto:
>> Std-Proposals_at_[hidden]>
>> > >> >
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals <
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals>
>> >
>> >
>>
>>
that I didn't get your question so I wanted to answer it properly):
hope this is what you asked for:
// Current C++:
#include <variant>
#include <iostream>
#include<string>
struct A { int get() { return 1; } };
struct B { std::string get() { return "love C++";} };
struct C { double get() { return 2.1; } };
struct data_accessed_through_visit {
static std::variant<A, B, C> obj;
inline void operator()(int) {
std::visit([](auto&& arg) {
std::cout<< arg.get();
}, obj);
}
};
//proves that std visit is too verbose and slow (too hard/impractical to
optimize) for this task
what about switches, enjoy the look for yourself(while imagining the look
if this were repeated multiple times over hundreds of case statements):
struct data_switched {
inline void operator()(int index) {
switch(index) {
case 0: std::cout<<1;
case 1: std::cout<<"love C++";
case 2: std::cout<<2;
default: return -1;
}
}
};
fix:
struct data_indexed {
{ 1, "love C++", 2} return_indexed_value(std::size_t index);
//A last value can be added to express what is to be returned if the
index is out of range, that would be awesome because the compiler can even
optimize that check.
inline void operator()(int index) {
std::cout<<return_indexed_value(index);
}
}
sorry for sending two emails at twice, i just had to clarify myself one
last time.
On Fri, Apr 3, 2026 at 4:09 AM Muneem <itfllow123_at_[hidden]> wrote:
> I am really really sorry for sending a flood of emails. As for the
> example, it's in the end of this proposal file, please feel free to change
> it physical or by contributing to it on my GitHub. The reason that I
> thought sending a lot of emails is a good idea is because we have short
> attention spans so I wanted to instantly answer your questions while you
> still have it in your minds. If you want, we can calloborate on my GitHub
> on this feature or on your's (I am comfortable either way and on any time
> of the day (except when I am sleeping))
> The link and the file is:
>
> https://github.com/HjaldrKhilji/Future-potential-ISO-porposals/blob/main/Extension%20to%20runtime%20polymorphism.txt
> The example clearly shows that just like the spaceship operator, showing
> intent is always better and valuable. I have yet to define detailed
> features of this new syntax (ie, Const and volatile qualifications, or even
> potential thread policies for a single access), this is because I don't
> know what the comcencess is. Technically, this can be solved a billion
> ways, but right now the solution is minimal because I didn't convince
> anyone to propose a solution themselves.
> Again, sorry for sending too much emails, I thought that you guys use AI
> to summarize all my points into short concise ones before reading my raw
> emails. I use this techniques when talking to foreigners that use confusing
> proverbs. Again, I should've known better so I am really really sorry.
>
>
> On Fri, 3 Apr 2026, 3:52 am Jens Maurer, <jens.maurer_at_[hidden]> wrote:
>
>> Hi!
>>
>> May I suggest that you stop your habit of sending multiple
>> rapid-fire responses to the same e-mail?
>>
>> Take a breath, take a good night's sleep, and take another
>> breath before replying to an e-mail. Then, compose that
>> e-mail as a draft, take another night, and re-read it the
>> next day. Remember, there are orders of magnitude more
>> people you want to read your e-mail than there are who
>> writes it (just you, I'd guess), so you should make it
>> as easy as possible for those people to read your
>> utterances.
>>
>> Despite the flurry of e-mails, I have not yet seen a
>> simple example showing a use of the facility you wish to
>> have (it doesn't have to compile with current compilers),
>> and a comparison with the (most obvious) code I would need
>> to write without the facility you wish to have.
>>
>> Jens
>>
>>
>> On 4/2/26 21:20, Muneem via Std-Proposals wrote:
>> > std::array<int, 3> array_1 = { 1,2,3 }; is also global variable, but
>> the ternary/subcript operator statements dont have a hard time figuring it
>> out.
>> > regards, Muneem.
>> >
>> >
>> > On Fri, Apr 3, 2026 at 12:18 AM Muneem <itfllow123_at_[hidden] <mailto:
>> itfllow123_at_[hidden]>> wrote:
>> >
>> > >why is std::variant object static
>> > I made the std::variant static because the array that I was getting
>> was also a global variable, infact even when I make the class objects
>> return an array variable instead of a constant in that std::variant, the
>> rise in latency is almost 0 because the latency of std::visit itself is
>> around 470 ns. Again, context and intent are the golden words for any
>> compiler.
>> > regards, Muneem
>> >
>> > On Fri, Apr 3, 2026 at 12:14 AM Muneem <itfllow123_at_[hidden]
>> <mailto:itfllow123_at_[hidden]>> wrote:
>> >
>> > So, I am sorry for confusing you, but the think is that
>> currently, you need std::visit or branching to index homogeneous values
>> with a runtime index, this means that the context and intent given to the
>> compiler is minimal, where as for something like template inlining, the
>> compiler has a lot of context and intent, from compiler optimization flags
>> to instantiation time (which can be delayed till link time for the linker,
>> which can lead to even better optimizations), to the context of defining
>> the template definition, to the context of argument dependent lookup even,
>> which for templates is even more flexible since it can loop for it in any
>> namespace to find the most perfect match, which again gives the compiler
>> more information. The goal is to give compiler context and intent so that
>> it can find the best time to optimize (any phase in compilation or
>> linkage), and it knows as much as possible on how to optimize. here are the
>> bench marks in visual studio 2026
>> > Time (ns) for switch: 97
>> > Time (ns) for visit: 534
>> > Time (ns) for ternary: 41
>> > Time (ns) for subscript: 61
>> >
>> > On Thu, Apr 2, 2026 at 3:02 PM Marcin Jaczewski <
>> marcinjaczewski86_at_[hidden] <mailto:marcinjaczewski86_at_[hidden]>> wrote:
>> >
>> > czw., 2 kwi 2026 o 11:45 Muneem <itfllow123_at_[hidden]
>> <mailto:itfllow123_at_[hidden]>> napisał(a):
>> > >
>> > > That's the point, std variant isn't usable here, nor is
>> any other polymorphic technique, hence we need the features that I
>> described in the standard. The results consistently show that subscription
>> and ternary statements can be easier to optimize than the others, which
>> show the need for a new construct to deal with homegenous sets of values.
>> Current branching techniques work, but can lead to verbose code, is adhoc,
>> and is just not efficient when compared to subcripting; thats why we need
>> subscripting like capabilities for homegenous sets of values. Context is
>> the most important for a compiler because it can reorder expressions in
>> that context in many ways (as long as the observable behaviour is the same,
>> refer to the standard for a defintion of "observable behaviour" if you
>> would like).
>> > > Std::variant and polymorphism fail at fixing everything,
>> and I think that we can fix this specific issue of homogenous lists with
>> this new feature. The goal is to give the compiler context and intent.
>> > > Regards, Muneem.
>> > >
>> >
>> > You repeat only buzzwords here. I ask for real life
>> examples of code
>> > that have this problem and how your "solution" fix it
>> exactly.
>> > It looks like an XY problem. Like, you have a contradiction
>> there:
>> > "polymorphic" and "homegenous".
>> > Why do you try to use techine that is for different kinds
>> of problems
>> > and claim it does not work?
>> > Did you try to use proper tools for problems you have?
>> >
>> > > On Thu, 2 Apr 2026, 1:11 pm Marcin Jaczewski, <
>> marcinjaczewski86_at_[hidden] <mailto:marcinjaczewski86_at_[hidden]>> wrote:
>> > >>
>> > >> czw., 2 kwi 2026 o 09:05 Muneem via Std-Proposals
>> > >> <std-proposals_at_[hidden] <mailto:
>> std-proposals_at_[hidden]>> napisał(a):
>> > >> >
>> > >> > hi!
>> > >> > Your point is partly correct, but this issue is quite
>> prevalent, below are my branches from multiple sources:
>> > >> > this is the updated code:
>> > >> > #include <variant>
>> > >> > #include <iostream>
>> > >> > #include <chrono>
>> > >> > #include <ctime>
>> > >> > #include <iomanip>
>> > >> > #include<array>
>> > >> > std::array<int, 3> array_1={1,2,3};
>> > >> >
>> > >> > struct A { int get() { return array_1[0]; } };
>> > >> > struct B { int get() { return array_1[1]; } };
>> > >> > struct C { int get() { return array_1[2]; } };
>> > >> >
>> > >> > struct data_accessed_through_visit {
>> > >> > static std::variant<A, B, C> obj;
>> > >> >
>> > >> > inline int operator()(int) {
>> > >> > return std::visit([](auto&& arg) {
>> > >> > return arg.get();
>> > >> > }, obj);
>> > >> > }
>> > >> > };
>> > >> > std::variant<A, B, C>
>> data_accessed_through_visit::obj=C{};
>> > >> > int user_index = 0;
>> > >>
>> > >> I can hold you right now, you use a static object for the
>> > >> `std::variant`, this means you test different thing than
>> in other
>> > >> cases.
>> > >> This can affect code gen and what optimizer can see and
>> guarantee.
>> > >>
>> > >> Besides, why use `variant` here? It was not designed for
>> this but to
>> > >> store different types in the same memory location.
>> > >> And none of this is used there, only making it harder to
>> compiler to
>> > >> see what is going on.
>> > >>
>> > >> Why not use simple pointers there? And if you need to do
>> some
>> > >> calculations function pointers?
>> > >>
>> > >> >
>> > >> > struct data_ternary {
>> > >> > inline int operator()(int index) {
>> > >> > return (index == 0) ? array_1[0] : (index ==
>> 1) ? array_1[1] : (index == 1) ? array_1[2] : -1;
>> > >> > }
>> > >> > };
>> > >> >
>> > >> > struct data_switched {
>> > >> > inline int operator()(int index) {
>> > >> > switch(index) {
>> > >> > case 0: return array_1[0];
>> > >> > case 1: return array_1[1];
>> > >> > case 2: return array_1[2];
>> > >> > default: return -1;
>> > >> > }
>> > >> > }
>> > >> > };
>> > >> >
>> > >> > struct data_indexing {
>> > >> > inline int operator()(int index) {
>> > >> > return array_1[index];
>> > >> > }
>> > >> > };
>> > >> >
>> > >> >
>> > >> >
>> > >> > volatile int x = 0;
>> > >> > constexpr uint64_t loop_count=10000;
>> > >> > static void measure_switch() {
>> > >> > data_switched obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> >
>> > >> > static void measure_visit() {
>> > >> > data_accessed_through_visit obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >>
>> > >> Why not use:
>> > >> ```
>> > >> static void measure_visit() {
>> > >> data_accessed_through_visit obj
>> > >> return std::visit([](auto&& arg) {
>> > >> for (int i=0; i++<loop_count;) {
>> > >> x = arg.get();
>> > >> }
>> > >> }, obj);
>> > >> }
>> > >> ```
>> > >> And this is more similar to what the compiler sees in
>> other test cases.
>> > >>
>> > >> >
>> > >> > static void measure_ternary() {
>> > >> > data_ternary obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> > static void measure_indexing() {
>> > >> > data_indexing obj;
>> > >> > for (int i=0; i++<loop_count;) {
>> > >> > x = obj(user_index);
>> > >> > }
>> > >> > }
>> > >> >
>> > >> > template<typename func_t>
>> > >> > void call_func(func_t callable_obj, int arg){
>> > >> > const auto start =
>> std::chrono::steady_clock::now();
>> > >> >
>> > >> > constexpr int how_much_to_loop=1000;
>> > >> > for(int i=0; i++<how_much_to_loop;){
>> > >> > callable_obj();
>> > >> > }
>> > >> > const auto end = std::chrono::steady_clock::now();
>> > >> > auto result=
>> std::chrono::duration_cast<std::chrono::nanoseconds>(end -
>> start).count()/how_much_to_loop;
>> > >> > std::cout<<result/how_much_to_loop<<std::endl;
>> > >> >
>> > >> > }
>> > >> >
>> > >> > int main() {
>> > >> > std::cout << "Enter index (0 for A, 1 for B, 2 for
>> C): ";
>> > >> > if (!(std::cin >> user_index)) return 1;
>> > >> >
>> > >> > // Set the variant state
>> > >> > if (user_index == 0)
>> data_accessed_through_visit::obj = A{};
>> > >> > else if (user_index == 1)
>> data_accessed_through_visit::obj = B{};
>> > >> > else if (user_index == 2)
>> data_accessed_through_visit::obj = C{};
>> > >> >
>> > >> > std::cout << "Time (ns) for switch: ";
>> > >> > call_func(measure_switch, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for visit: ";
>> > >> > call_func(measure_visit, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for ternary: ";
>> > >> > call_func(measure_ternary, user_index);
>> > >> >
>> > >> > std::cout << "Time (ns) for subscript: ";
>> > >> > call_func(measure_indexing, user_index);
>> > >> >
>> > >> > return 0;
>> > >> > }
>> > >> > The benchmarks consistently show that these syntax
>> constructs do matter (the smaller the index range, the more the compiler
>> can flatten it and work out how to branch). Notice how the ternary outperforms
>> them all even though it is nested. This suggests that adding new syntax whose
>> sole purpose is to give compilers as much information as possible is
>> actually useful.
>> > >>
>> > >> And what exactly do you propose here? Different
>> structures behave differently.
>> > >> How exactly will it look and work? Right now you use a lot
>> of vague
>> > >> statements that do not show anything.
>> > >>
>> > >> > Consider how templates and instantiation give the
>> compiler extra insight. Why? Because templates are instantiated at the
>> point of instantiation, which can be delayed up to link time. These are the
>> benchmarks:
>> > >>
>> > >> What? Did you confuse it with link-time optimization?
>> Instantiation happens
>> > >> in the translation unit, not at link time, as that would be
>> too late.
>> > >> We have extern templates, but then the compiler in that TU becomes
>> blind to that
>> > >> code, as it is instantiated in another TU. LTO can help
>> there, but
>> > >> then we have exactly the same case as when we use normal
>> functions.
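The extern-template point above can be sketched in a single file. The function twice and the wrapper use_twice are hypothetical names chosen for illustration. In a real project the explicit instantiation definition would normally sit in one separate .cpp file; it is placed in the same file here only so that the sketch links on its own.

```cpp
// A function template that would normally live in a header.
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation declaration ("extern template"): asks this
// translation unit not to implicitly instantiate twice<int>, on the
// promise that some translation unit provides the definition.
extern template int twice<int>(int);

// Explicit instantiation definition: usually placed in exactly one
// other .cpp file. It must follow the declaration when both appear
// in the same translation unit, as they do in this sketch.
template int twice<int>(int);

// Ordinary caller; it resolves to the explicitly instantiated twice<int>.
int use_twice(int v) { return twice(v); }
```

When the definition lives in another translation unit, the caller sees only a declaration, much like a call to an ordinary non-inline function, which is why LTO is needed in both cases to optimize across the boundary.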
>> > >>
>> > >> > benchmarks for g++:
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 33
>> > >> > Time (ns) for visit: 278
>> > >> > Time (ns) for ternary: 19
>> > >> > Time (ns) for subscript: 34
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 33
>> > >> > Time (ns) for visit: 296
>> > >> > Time (ns) for ternary: 20
>> > >> > Time (ns) for subscript: 35
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 271
>> > >> > Time (ns) for ternary: 17
>> > >> > Time (ns) for subscript: 33
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 281
>> > >> > Time (ns) for ternary: 19
>> > >> > Time (ns) for subscript: 32
>> > >> > PS C:\Users\drnoo\Downloads> .\a.exe
>> > >> > Enter index (0 for A, 1 for B, 2 for C): 2
>> > >> > Time (ns) for switch: 34
>> > >> > Time (ns) for visit: 282
>> > >> > Time (ns) for ternary: 20
>> > >> > Time (ns) for subscript: 34
>> > >> > I really have to go to sleep now (I am having some
>> issues with Visual Studio 2026). I hope it would be acceptable for me to
>> send the benchmarks for that tomorrow.
>> > >> >
>> > >> > regards, Muneem
>> > >> >
>> > >> >
>> > >> > On Thu, Apr 2, 2026 at 10:54 AM Thiago Macieira via
>> Std-Proposals <std-proposals_at_[hidden] <mailto:
>> std-proposals_at_[hidden]>> wrote:
>> > >> >>
>> > >> >> On Wednesday, 1 April 2026 21:56:44 Pacific Daylight
>> Time Muneem via Std-
>> > >> >> Proposals wrote:
>> > >> >> > /*
>> > >> >> > Time (ns) for switch: 168100
>> > >> >> > Time (ns) for visit: 3664100
>> > >> >> > Time (ns) for ternary: 190900
>> > >> >> > It keeps on getting worse!
>> > >> >> > */
>> > >> >>
>> > >> >> So far you've maybe shown that one implementation is
>> generating bad code. Have
>> > >> >> you tried others?
>> > >> >>
>> > >> >> You need to prove that this is an inherent and
>> unavoidable problem of the
>> > >> >> requirements, not that it just happened to be bad for
>> this implementation.
>> > >> >> Just quickly reading the proposed benchmark code, it
>> would seem there's no
>> > >> >> such inherent reason and you're making an unfounded
>> and probably incorrect
>> > >> >> assumption about how things actually work.
>> > >> >>
>> > >> >> In fact, I pasted a portion of your code into godbolt
>> just to see what the
>> > >> >> variant visit code, which you claim to be
>> unnecessarily slow, would look like:
>> > >> >> https://gcc.godbolt.org/z/WK5bMzcae
>> > >> >>
>> > >> >> The first thing to note in the GCC/libstdc++ pane is
>> that it does not use
>> > >> >> user_index. The compiler thinks it's a constant,
>> meaning this benchmark is
>> > >> >> faulty. And thus it has constant-propagated this
>> value and is *incredibly*
>> > >> >> efficient in doing nothing useful. MSVC did likewise.
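One common way to stop the optimizer from treating the index as a constant is to read it through a volatile object and to store the result in a volatile sink. The sketch below is an assumption-laden illustration, not the benchmark from the thread: pick, opaque_index, sink and time_pick_ns are all hypothetical names, and the returned values are arbitrary.

```cpp
#include <chrono>

// Hypothetical stand-in for the dispatch being measured.
inline int pick(int index) {
    switch (index) {
        case 0: return 1;
        case 1: return 5;
        case 2: return 10;
        default: return -1;
    }
}

// Every read of a volatile object is an observable access, so the
// compiler cannot assume the index is a known compile-time constant.
volatile int opaque_index = 2;

// Storing into a volatile object keeps each result observable, so
// the loop body cannot be discarded as dead code.
volatile int sink = 0;

// Returns the total elapsed nanoseconds for `iterations` dispatches.
long long time_pick_ns(int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = pick(opaque_index);
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

The volatile accesses add a load and a store per iteration, so this measures dispatch plus that fixed overhead; the same overhead applies equally to every variant being compared.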
>> > >> >>
>> > >> >> Since MSVC outputs the out-of-line copy of inlined
>> functions, we can see the
>> > >> >> operator() expansion without the propagation of the
>> user_index constant. And
>> > >> >> it's no different from what a ternary or switch would
>> look like.
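To make the comparison concrete, here is a minimal sketch of the same dispatch written once with std::visit and once as a hand-written switch on index(). It assumes, unlike the A/B/C structs earlier in the thread, that every alternative returns int so both functions share a return type; Holder, via_visit and via_switch are hypothetical names.

```cpp
#include <variant>

// Simplified alternatives: all get() functions return int (an
// assumption made for this sketch only).
struct A { int get() const { return 1; } };
struct B { int get() const { return 5; } };
struct C { int get() const { return 10; } };
using Holder = std::variant<A, B, C>;

// Dispatch through std::visit with a generic lambda.
int via_visit(const Holder& v) {
    return std::visit([](const auto& alt) { return alt.get(); }, v);
}

// The equivalent hand-written dispatch on the active index.
int via_switch(const Holder& v) {
    switch (v.index()) {
        case 0: return std::get<0>(v).get();
        case 1: return std::get<1>(v).get();
        case 2: return std::get<2>(v).get();
        default: return -1;  // reached only if the variant is valueless
    }
}
```

Both functions compute the same result for every alternative; whether they compile to the same machine code depends on the compiler and standard library implementation, which is what the godbolt comparison above examines.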
>> > >> >>
>> > >> >> In the Clang/libc++ pane, we see indirect function
>> calls. I don't know why
>> > >> >> libc++ std::variant is implemented this way, but it
>> could be why it is slow
>> > >> >> for you if you're using this implementation. If you
>> tell Clang to instead use
>> > >> >> libstdc++ (remove the last argument of the
>> command-line), the indirect
>> > >> >> function call disappears and we see an unrolled loop
>> of loading the value 10.
>> > >> >> That would mean Clang is even more efficient at doing
>> nothing.
>> > >> >>
>> > >> >> Conclusion: it looks like your assumption that there
>> is a problem to be solved
>> > >> >> is faulty. There is no problem.
>> > >> >>
>> > >> >> --
>> > >> >> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>> > >> >> Principal Engineer - Intel Data Center - Platform &
>> Sys. Eng.
>> > >> >> --
>> > >> >> Std-Proposals mailing list
>> > >> >> Std-Proposals_at_[hidden] <mailto:
>> Std-Proposals_at_[hidden]>
>> > >> >>
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals <
>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals>
>> > >> >
>> >
>> >
>>
>>
Received on 2026-04-02 23:53:47
