Date: Thu, 2 Apr 2026 12:16:48 +0500
One last note:
The speed when -O3 is set on g++ is:
PS C:\Users\drnoo\Downloads> .\a.exe
Enter index (0 for A, 1 for B, 2 for C): 2
Time (ns) for switch: 6
Time (ns) for visit: 11
Time (ns) for ternary: 2
Time (ns) for subscript: 2
PS C:\Users\drnoo\Downloads> .\a.exe
Enter index (0 for A, 1 for B, 2 for C): 2
Time (ns) for switch: 6
Time (ns) for visit: 10
Time (ns) for ternary: 2
Time (ns) for subscript: 2
PS C:\Users\drnoo\Downloads> .\a.exe
Enter index (0 for A, 1 for B, 2 for C): 2
Time (ns) for switch: 6
Time (ns) for visit: 11
Time (ns) for ternary: 2
Time (ns) for subscript: 2
PS C:\Users\drnoo\Downloads> .\a.exe
Enter index (0 for A, 1 for B, 2 for C): 2
Time (ns) for switch: 6
Time (ns) for visit: 11
Time (ns) for ternary: 2
Time (ns) for subscript: 2
PS C:\Users\drnoo\Downloads> .\a.exe
Enter index (0 for A, 1 for B, 2 for C): 2
Time (ns) for switch: 6
Time (ns) for visit: 11
Time (ns) for ternary: 2
Time (ns) for subscript: 2
(Please ignore the weird "drnoo" part; it's just a username that my father
chose.)
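
For anyone who wants to cross-check numbers like these, here is a minimal sketch (not the code that produced the timings above) of a harness that divides the elapsed time by the actual number of calls and checks at compile time that the ternary and switch forms select the same element. All names in it (select_ternary, select_switch, selectors_agree, ns_per_iteration) are hypothetical and chosen only for illustration; it assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper names; none of these come from the posted benchmark.

// Ternary chain: each test compares against a distinct index.
constexpr int select_ternary(const std::array<int, 3>& values, int index) {
    return (index == 0) ? values[0]
         : (index == 1) ? values[1]
         : (index == 2) ? values[2]
         : -1;
}

// The same mapping written as a switch, for a like-for-like comparison.
constexpr int select_switch(const std::array<int, 3>& values, int index) {
    switch (index) {
        case 0: return values[0];
        case 1: return values[1];
        case 2: return values[2];
        default: return -1;
    }
}

// Compile-time check that both forms agree, including out-of-range indices.
constexpr bool selectors_agree() {
    constexpr std::array<int, 3> v{1, 2, 3};
    for (int i = -1; i <= 3; ++i) {
        if (select_ternary(v, i) != select_switch(v, i)) return false;
    }
    return true;
}
static_assert(selectors_agree(), "ternary and switch must select the same element");

// Calls f `iterations` times and returns the average nanoseconds per call.
// Each result is stored to a volatile sink so the loop body is not removed.
template <typename F>
double ns_per_iteration(F f, std::uint64_t iterations) {
    assert(iterations > 0);
    volatile int sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sink = f();
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    const auto total =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return static_cast<double>(total) / static_cast<double>(iterations);
}
```

A caller would read the index from std::cin into a variable, then pass a lambda such as [&] { return select_switch(values, index); } together with the iteration count. Whether this is enough to stop a given compiler from hoisting the selection out of the loop is an assumption that should be confirmed by reading the generated assembly.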
On Thu, Apr 2, 2026 at 12:08 PM Muneem <itfllow123_at_[hidden]> wrote:
> My last note for today: I really hope that we can continue this
> discussion when I wake up. My feature won't fix everything, but it
> would fix branching to get a single value in homogeneous lists at runtime.
> The reason that I use heterogeneous pairs is that the feature isn't
> supported yet.
>
> On Thu, Apr 2, 2026 at 12:06 PM Muneem <itfllow123_at_[hidden]> wrote:
>
>> Also, I always have std::cin to make sure that the compiler never cheats!
>>
>> On Thu, Apr 2, 2026 at 12:05 PM Muneem <itfllow123_at_[hidden]> wrote:
>>
>>> Hi!
>>> Your point is partly correct, but this issue is quite prevalent. Below
>>> are my benchmarks from multiple sources.
>>> This is the updated code:
>>> #include <variant>
>>> #include <iostream>
>>> #include <chrono>
>>> #include <ctime>
>>> #include <iomanip>
>>> #include <array>
>>> #include <cstdint>
>>> std::array<int, 3> array_1={1,2,3};
>>>
>>> struct A { int get() { return array_1[0]; } };
>>> struct B { int get() { return array_1[1]; } };
>>> struct C { int get() { return array_1[2]; } };
>>>
>>> struct data_accessed_through_visit {
>>> static std::variant<A, B, C> obj;
>>>
>>> inline int operator()(int) {
>>> return std::visit([](auto&& arg) {
>>> return arg.get();
>>> }, obj);
>>> }
>>> };
>>> std::variant<A, B, C> data_accessed_through_visit::obj=C{};
>>> int user_index = 0;
>>>
>>> struct data_ternary {
>>> inline int operator()(int index) {
>>> return (index == 0) ? array_1[0] : (index == 1) ? array_1[1] :
>>> (index == 2) ? array_1[2] : -1;
>>> }
>>> };
>>>
>>> struct data_switched {
>>> inline int operator()(int index) {
>>> switch(index) {
>>> case 0: return array_1[0];
>>> case 1: return array_1[1];
>>> case 2: return array_1[2];
>>> default: return -1;
>>> }
>>> }
>>> };
>>>
>>> struct data_indexing {
>>> inline int operator()(int index) {
>>> return array_1[index];
>>> }
>>> };
>>>
>>>
>>>
>>> volatile int x = 0;
>>> constexpr uint64_t loop_count=10000;
>>> static void measure_switch() {
>>> data_switched obj;
>>> for (int i=0; i++<loop_count;) {
>>> x = obj(user_index);
>>> }
>>> }
>>>
>>> static void measure_visit() {
>>> data_accessed_through_visit obj;
>>> for (int i=0; i++<loop_count;) {
>>> x = obj(user_index);
>>> }
>>> }
>>>
>>> static void measure_ternary() {
>>> data_ternary obj;
>>> for (int i=0; i++<loop_count;) {
>>> x = obj(user_index);
>>> }
>>> }
>>> static void measure_indexing() {
>>> data_indexing obj;
>>> for (int i=0; i++<loop_count;) {
>>> x = obj(user_index);
>>> }
>>> }
>>>
>>> template<typename func_t>
>>> void call_func(func_t callable_obj, int arg){
>>> const auto start = std::chrono::steady_clock::now();
>>>
>>> constexpr int how_much_to_loop=1000;
>>> for(int i=0; i++<how_much_to_loop;){
>>> callable_obj();
>>> }
>>> const auto end = std::chrono::steady_clock::now();
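>>> // Note: count() is divided by how_much_to_loop twice, so the value printed
>>> // is total_ns / 1'000'000, while loop_count * how_much_to_loop = 10'000'000
>>> // inner iterations run in total.
>>> auto result=
>>> std::chrono::duration_cast<std::chrono::nanoseconds>(end -
>>> start).count()/how_much_to_loop;
>>> std::cout<<result/how_much_to_loop<<std::endl;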
>>>
>>> }
>>>
>>> int main() {
>>> std::cout << "Enter index (0 for A, 1 for B, 2 for C): ";
>>> if (!(std::cin >> user_index)) return 1;
>>>
>>> // Set the variant state
>>> if (user_index == 0) data_accessed_through_visit::obj = A{};
>>> else if (user_index == 1) data_accessed_through_visit::obj = B{};
>>> else if (user_index == 2) data_accessed_through_visit::obj = C{};
>>>
>>> std::cout << "Time (ns) for switch: ";
>>> call_func(measure_switch, user_index);
>>>
>>> std::cout << "Time (ns) for visit: ";
>>> call_func(measure_visit, user_index);
>>>
>>> std::cout << "Time (ns) for ternary: ";
>>> call_func(measure_ternary, user_index);
>>>
>>> std::cout << "Time (ns) for subscript: ";
>>> call_func(measure_indexing, user_index);
>>>
>>> return 0;
>>> }
>>> The benchmarks consistently show that these syntax constructs do matter
>>> (the smaller the index range is, the more the compiler can flatten it and
>>> know how to branch). Notice how the ternary outperforms them all even
>>> though it is nested. This means that adding new syntax with the sole purpose
>>> of giving compilers as much information as possible is actually useful.
>>> Consider how templates and instantiation give the compiler extra insight.
>>> Why? Because templates are instantiated at the point of instantiation, which
>>> can be delayed up to link time. These are the benchmarks:
>>> benchmarks for g++:
>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>> Time (ns) for switch: 33
>>> Time (ns) for visit: 278
>>> Time (ns) for ternary: 19
>>> Time (ns) for subscript: 34
>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>> Time (ns) for switch: 33
>>> Time (ns) for visit: 296
>>> Time (ns) for ternary: 20
>>> Time (ns) for subscript: 35
>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>> Time (ns) for switch: 34
>>> Time (ns) for visit: 271
>>> Time (ns) for ternary: 17
>>> Time (ns) for subscript: 33
>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>> Time (ns) for switch: 34
>>> Time (ns) for visit: 281
>>> Time (ns) for ternary: 19
>>> Time (ns) for subscript: 32
>>> PS C:\Users\drnoo\Downloads> .\a.exe
>>> Enter index (0 for A, 1 for B, 2 for C): 2
>>> Time (ns) for switch: 34
>>> Time (ns) for visit: 282
>>> Time (ns) for ternary: 20
>>> Time (ns) for subscript: 34
>>> I really have to go to sleep now (I am having some issues with Visual
>>> Studio 2026). I hope it would be acceptable for me to send the benchmarks
>>> for that tomorrow.
>>>
>>> regards, Muneem
>>>
>>>
>>> On Thu, Apr 2, 2026 at 10:54 AM Thiago Macieira via Std-Proposals <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> On Wednesday, 1 April 2026 21:56:44 Pacific Daylight Time Muneem via
>>>> Std-Proposals wrote:
>>>> > /*
>>>> > Time (ns) for switch: 168100
>>>> > Time (ns) for visit: 3664100
>>>> > Time (ns) for ternary: 190900
>>>> > It keeps on getting worse!
>>>> > */
>>>>
>>>> So far you've maybe shown that one implementation is generating bad code.
>>>> Have you tried others?
>>>>
>>>> You need to prove that this is an inherent and unavoidable problem of the
>>>> requirements, not that it just happened to be bad for this implementation.
>>>> Just quickly reading the proposed benchmark code, it would seem there's no
>>>> such inherent reason and you're making an unfounded and probably incorrect
>>>> assumption about how things actually work.
>>>>
>>>> In fact, I pasted a portion of your code into godbolt just to see what the
>>>> variant visit code, which you claim to be unnecessarily slow, would look
>>>> like:
>>>> https://gcc.godbolt.org/z/WK5bMzcae
>>>>
>>>> The first thing to note in the GCC/libstdc++ pane is that it does not use
>>>> user_index. The compiler thinks it's a constant, meaning this benchmark is
>>>> faulty. And thus it has constant-propagated this value and is *incredibly*
>>>> efficient in doing nothing useful. MSVC did likewise.
>>>>
>>>> Since MSVC outputs the out-of-line copy of inlined functions, we can see the
>>>> operator() expansion without the propagation of the user_index constant. And
>>>> it's no different than what a ternary or switch would look like.
>>>>
>>>> In the Clang/libc++ pane, we see indirect function calls. I don't know why
>>>> libc++ std::variant is implemented this way, but it could be why it is slow
>>>> for you if you're using this implementation. If you tell Clang to instead use
>>>> libstdc++ (remove the last argument of the command-line), the indirect
>>>> function call disappears and we see an unrolled loop of loading the value 10.
>>>> That would mean Clang is even more efficient at doing nothing.
>>>>
>>>> Conclusion: it looks like your assumption that there is a problem to be solved
>>>> is faulty. There is no problem.
>>>>
>>>> --
>>>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>> --
>>>> Std-Proposals mailing list
>>>> Std-Proposals_at_[hidden]
>>>> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>>>>
>>>
Received on 2026-04-02 07:17:04
