std-proposals

Re: [std-proposals] Fwd: Extension to runtime polymorphism proposed

From: Muneem <itfllow123_at_[hidden]>
Date: Thu, 2 Apr 2026 12:24:06 +0500
Not to forget that the main goal is to make branching less verbose.
Currently, indexing helps for lists of heterogeneous values, but it doesn't
work for homogeneous ones, so this is not a proposal for a fancier branching
mechanism, but for a less ad hoc way of specifically "indexing through
homogeneous lists".
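
For context, here is a minimal sketch of what runtime indexing looks like in C++17 today. It is an illustration only, not part of the proposal. A std::array can be subscripted directly with a runtime index, whereas picking one element of a std::tuple by a runtime index needs generated branching. The helper name `visit_at` is hypothetical and does not come from this thread or from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical helper (not a standard facility): expands to one comparison
// per tuple element and calls f on the element whose position matches.
template <typename Tuple, typename F, std::size_t... Is>
bool visit_at_impl(Tuple& t, std::size_t index, F& f,
                   std::index_sequence<Is...>) {
    // Fold over ||: evaluation stops at the first matching position.
    // Yields false when no position matches (index out of range).
    return ((index == Is ? (f(std::get<Is>(t)), true) : false) || ...);
}

// Calls f on the element at the given runtime index.
// Returns false if the index is out of range.
template <typename... Ts, typename F>
bool visit_at(std::tuple<Ts...>& t, std::size_t index, F&& f) {
    return visit_at_impl(t, index, f, std::index_sequence_for<Ts...>{});
}
```

A call such as `visit_at(my_tuple, i, [](const auto& v) { /* use v */ });` then plays the role that `my_array[i]` plays for an array, at the cost of the extra machinery above.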

On Thu, 2 Apr 2026, 12:16 pm Muneem, <itfllow123_at_[hidden]> wrote:

> One last note:
> the timings with -O3 set on g++ are:
> PS C:\Users\drnoo\Downloads> .\a.exe
> Enter index (0 for A, 1 for B, 2 for C): 2
> Time (ns) for switch: 6
> Time (ns) for visit: 11
> Time (ns) for ternary: 2
> Time (ns) for subscript: 2
> PS C:\Users\drnoo\Downloads> .\a.exe
> Enter index (0 for A, 1 for B, 2 for C): 2
> Time (ns) for switch: 6
> Time (ns) for visit: 10
> Time (ns) for ternary: 2
> Time (ns) for subscript: 2
> PS C:\Users\drnoo\Downloads> .\a.exe
> Enter index (0 for A, 1 for B, 2 for C): 2
> Time (ns) for switch: 6
> Time (ns) for visit: 11
> Time (ns) for ternary: 2
> Time (ns) for subscript: 2
> PS C:\Users\drnoo\Downloads> .\a.exe
> Enter index (0 for A, 1 for B, 2 for C): 2
> Time (ns) for switch: 6
> Time (ns) for visit: 11
> Time (ns) for ternary: 2
> Time (ns) for subscript: 2
> PS C:\Users\drnoo\Downloads> .\a.exe
> Enter index (0 for A, 1 for B, 2 for C): 2
> Time (ns) for switch: 6
> Time (ns) for visit: 11
> Time (ns) for ternary: 2
> Time (ns) for subscript: 2
> (please ignore the weird "drnoo" part; it's just a username that my father
> chose)
>
> On Thu, Apr 2, 2026 at 12:08 PM Muneem <itfllow123_at_[hidden]> wrote:
>
>> My last note for today: I really hope that we can continue this
>> discussion when I wake up. My feature won't fix everything, but it
>> would fix branching to get a single value from homogeneous lists at runtime.
>> The reason that I use heterogeneous pairs is that the feature isn't
>> supported yet.
>>
>> On Thu, Apr 2, 2026 at 12:06 PM Muneem <itfllow123_at_[hidden]> wrote:
>>
>>> Also, I always read the index from std::cin to make sure that the
>>> compiler never cheats!
>>>
>>> On Thu, Apr 2, 2026 at 12:05 PM Muneem <itfllow123_at_[hidden]> wrote:
>>>
>>>> hi!
>>>> Your point is partly correct, but this issue is quite prevalent, below
>>>> are my branches from multiple sources:
>>>> this is the updated code:
>>>> #include <variant>
>>>> #include <iostream>
>>>> #include <chrono>
>>>> #include <ctime>
>>>> #include <iomanip>
>>>> #include <array>
>>>> // Shared data that every access strategy below reads from.
>>>> std::array<int, 3> array_1 = {1, 2, 3};
>>>>
>>>> struct A { int get() { return array_1[0]; } };
>>>> struct B { int get() { return array_1[1]; } };
>>>> struct C { int get() { return array_1[2]; } };
>>>>
>>>> struct data_accessed_through_visit {
>>>>     static std::variant<A, B, C> obj;
>>>>
>>>>     inline int operator()(int) {
>>>>         return std::visit([](auto&& arg) {
>>>>             return arg.get();
>>>>         }, obj);
>>>>     }
>>>> };
>>>> std::variant<A, B, C> data_accessed_through_visit::obj=C{};
>>>> int user_index = 0;
>>>>
>>>> struct data_ternary {
>>>>     inline int operator()(int index) {
>>>>         return (index == 0) ? array_1[0]
>>>>              : (index == 1) ? array_1[1]
>>>>              : (index == 2) ? array_1[2]
>>>>              : -1;  // out-of-range index
>>>>     }
>>>> };
>>>>
>>>> struct data_switched {
>>>>     inline int operator()(int index) {
>>>>         switch (index) {
>>>>             case 0: return array_1[0];
>>>>             case 1: return array_1[1];
>>>>             case 2: return array_1[2];
>>>>             default: return -1;
>>>>         }
>>>>     }
>>>> };
>>>>
>>>> struct data_indexing {
>>>>     inline int operator()(int index) {
>>>>         return array_1[index];
>>>>     }
>>>> };
>>>>
>>>>
>>>>
>>>> // volatile sink so the stores in the loops below are not optimized away.
>>>> volatile int x = 0;
>>>> constexpr int loop_count = 10000;
>>>> static void measure_switch() {
>>>>     data_switched obj;
>>>>     for (int i = 0; i < loop_count; ++i) {
>>>>         x = obj(user_index);
>>>>     }
>>>> }
>>>>
>>>> static void measure_visit() {
>>>>     data_accessed_through_visit obj;
>>>>     for (int i = 0; i < loop_count; ++i) {
>>>>         x = obj(user_index);
>>>>     }
>>>> }
>>>>
>>>> static void measure_ternary() {
>>>>     data_ternary obj;
>>>>     for (int i = 0; i < loop_count; ++i) {
>>>>         x = obj(user_index);
>>>>     }
>>>> }
>>>>
>>>> static void measure_indexing() {
>>>>     data_indexing obj;
>>>>     for (int i = 0; i < loop_count; ++i) {
>>>>         x = obj(user_index);
>>>>     }
>>>> }
>>>>
>>>> template<typename func_t>
>>>> void call_func(func_t callable_obj, int /*arg*/) {
>>>>     const auto start = std::chrono::steady_clock::now();
>>>>
>>>>     constexpr int how_much_to_loop = 1000;
>>>>     for (int i = 0; i < how_much_to_loop; ++i) {
>>>>         callable_obj();
>>>>     }
>>>>     const auto end = std::chrono::steady_clock::now();
>>>>     // The total time is divided by how_much_to_loop twice (10^6 overall),
>>>>     // while each callable_obj() call runs loop_count (10^4) iterations,
>>>>     // so the printed figure is the time for 10 inner iterations.
>>>>     auto result =
>>>>         std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()
>>>>         / how_much_to_loop;
>>>>     std::cout << result / how_much_to_loop << std::endl;
>>>> }
>>>>
>>>> int main() {
>>>>     std::cout << "Enter index (0 for A, 1 for B, 2 for C): ";
>>>>     if (!(std::cin >> user_index)) return 1;
>>>>
>>>>     // Set the variant state
>>>>     if (user_index == 0) data_accessed_through_visit::obj = A{};
>>>>     else if (user_index == 1) data_accessed_through_visit::obj = B{};
>>>>     else if (user_index == 2) data_accessed_through_visit::obj = C{};
>>>>
>>>>     std::cout << "Time (ns) for switch: ";
>>>>     call_func(measure_switch, user_index);
>>>>
>>>>     std::cout << "Time (ns) for visit: ";
>>>>     call_func(measure_visit, user_index);
>>>>
>>>>     std::cout << "Time (ns) for ternary: ";
>>>>     call_func(measure_ternary, user_index);
>>>>
>>>>     std::cout << "Time (ns) for subscript: ";
>>>>     call_func(measure_indexing, user_index);
>>>>
>>>>     return 0;
>>>> }
>>>> The benchmarks consistently show that these syntax constructs do
>>>> matter (the smaller the index range is, the more the compiler can flatten
>>>> it and know how to branch). Notice how the ternary outperforms them all
>>>> even though it is nested. This means that adding new syntax with the sole
>>>> purpose of giving compilers as much information as possible is actually
>>>> useful. Consider how templates and instantiation give the compiler extra
>>>> insight. Why? Because templates are instantiated at the point of
>>>> instantiation, which can be delayed up to link time. These are the
>>>> benchmarks for g++:
>>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>>> Time (ns) for switch: 33
>>>> Time (ns) for visit: 278
>>>> Time (ns) for ternary: 19
>>>> Time (ns) for subscript: 34
>>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>>> Time (ns) for switch: 33
>>>> Time (ns) for visit: 296
>>>> Time (ns) for ternary: 20
>>>> Time (ns) for subscript: 35
>>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>>> Time (ns) for switch: 34
>>>> Time (ns) for visit: 271
>>>> Time (ns) for ternary: 17
>>>> Time (ns) for subscript: 33
>>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>>> Time (ns) for switch: 34
>>>> Time (ns) for visit: 281
>>>> Time (ns) for ternary: 19
>>>> Time (ns) for subscript: 32
>>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>>> Time (ns) for switch: 34
>>>> Time (ns) for visit: 282
>>>> Time (ns) for ternary: 20
>>>> Time (ns) for subscript: 34
>>>> I really have to go to sleep now (I am having some issues with Visual
>>>> Studio 2026). I hope it would be acceptable for me to send the benchmarks
>>>> for that tomorrow.
>>>>
>>>> regards, Muneem
>>>>
>>>>
>>>> On Thu, Apr 2, 2026 at 10:54 AM Thiago Macieira via Std-Proposals <
>>>> std-proposals_at_[hidden]> wrote:
>>>>
>>>>> On Wednesday, 1 April 2026 21:56:44 Pacific Daylight Time Muneem via
>>>>> Std-Proposals wrote:
>>>>> > /*
>>>>> > Time (ns) for switch: 168100
>>>>> > Time (ns) for visit: 3664100
>>>>> > Time (ns) for ternary: 190900
>>>>> > It keeps on getting worse!
>>>>> > */
>>>>>
>>>>> So far you've maybe shown that one implementation is generating bad
>>>>> code. Have you tried others?
>>>>>
>>>>> You need to prove that this is an inherent and unavoidable problem of
>>>>> the requirements, not that it just happened to be bad for this
>>>>> implementation. Just quickly reading the proposed benchmark code, it
>>>>> would seem there's no such inherent reason and you're making an
>>>>> unfounded and probably incorrect assumption about how things actually
>>>>> work.
>>>>>
>>>>> In fact, I pasted a portion of your code into godbolt just to see what
>>>>> the variant visit code, which you claim to be unnecessarily slow, would
>>>>> look like:
>>>>> https://gcc.godbolt.org/z/WK5bMzcae
>>>>>
>>>>> The first thing to note in the GCC/libstdc++ pane is that it does not
>>>>> use user_index. The compiler thinks it's a constant, meaning this
>>>>> benchmark is faulty. And thus it has constant-propagated this value and
>>>>> is *incredibly* efficient in doing nothing useful. MSVC did likewise.
>>>>>
>>>>> Since MSVC outputs the out-of-line copy of inlined functions, we can
>>>>> see the operator() expansion without the propagation of the user_index
>>>>> constant. And it's no different than what a ternary or switch would
>>>>> look like.
>>>>>
>>>>> In the Clang/libc++ pane, we see indirect function calls. I don't know
>>>>> why libc++ std::variant is implemented this way, but it could be why it
>>>>> is slow for you if you're using this implementation. If you tell Clang
>>>>> to instead use libstdc++ (remove the last argument of the
>>>>> command-line), the indirect function call disappears and we see an
>>>>> unrolled loop of loading the value 10. That would mean Clang is even
>>>>> more efficient at doing nothing.
>>>>>
>>>>> Conclusion: it looks like your assumption that there is a problem to
>>>>> be solved is faulty. There is no problem.
>>>>>
>>>>> --
>>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>>> --
>>>>> Std-Proposals mailing list
>>>>> Std-Proposals_at_[hidden]
>>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>>
>>>>
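
The constant-propagation concern raised in the quoted reply can be guarded against along the following lines. This is a minimal sketch under stated assumptions, not the code either poster used: the names `table`, `runtime_index`, `sink`, `lookup_switch`, `lookup_subscript` and `ns_per_call` are all hypothetical. The index is read through a volatile object on every iteration, so the compiler cannot fold it into a constant, and each result is stored to a volatile sink so the call cannot be dropped.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstdint>

std::array<int, 3> table{1, 2, 3};

// Every read of a volatile object is an observable access, so the
// optimizer cannot assume a fixed value for the index.
volatile int runtime_index = 0;
// Every write to a volatile object must happen, so the lookups stay.
volatile int sink = 0;

int lookup_switch(int index) {
    switch (index) {
        case 0: return table[0];
        case 1: return table[1];
        case 2: return table[2];
        default: return -1;
    }
}

int lookup_subscript(int index) {
    assert(index >= 0 && index < 3);  // the subscript has no fallback value
    return table[index];
}

// Average time in nanoseconds for one call of f, over the given number
// of iterations. The figure includes the volatile load and store.
template <typename F>
double ns_per_call(F f, std::uint64_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sink = f(runtime_index);
    }
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = end - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Even with these guards, the numbers only describe one compiler and standard library, so inspecting the generated assembly, as the reply does, remains the more reliable check.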

Received on 2026-04-02 07:24:25