std-proposals

Re: [std-proposals] Fwd: Extension to runtime polymorphism proposed

From: Breno Guimarães <brenorg_at_[hidden]>
Date: Thu, 2 Apr 2026 17:23:41 -0300
Where is the code? I understand the description, but I want to see what you
envision it would look like as actual C++ code.

On Thu, 2 Apr 2026 at 17:18, Muneem via Std-Proposals <
std-proposals_at_[hidden]> wrote:

> Hi!
> Thank you for your feedback ❤️❤️❤️.
> Sorry if some of the things that I said were incorrect. My philosophical
> point was that compilers are layered models that need a language to convey
> intent and context so that they can produce proper code. That is how it
> should be, and it is why we use compiled languages.
> On the question of what the syntax would look like:
> It would have the same semantics as a std::array (runtime indexing, and a
> template instantiation for every collection that it is instantiated for), but
> it would be heterogeneous. (I called it half open in the previous mails
> because that's what Stroustrup calls sequential containers.) To index, you
> can use an integer or any type that converts to one implicitly or explicitly
> (just like any other half-open container type). To implement this container,
> the language would have to provide one of these facilities at the language
> level:
> 1. Indexing of user-defined objects, or a new construct like
> struct/class that can be indexed.
> 2. Heterogeneous lists (unlikely to be implemented).
> 3. A specific kind of function that can have different return
> types based on the index passed:
> {Int_obj, double_obj, float_obj} return_indexed_value(index of any integer
> type that satisfies the integral concept);
> // The compiler deduces the body itself.
> A cherry on top would be to allow any class to be used for
> indexing (such classes must have a particular function that provides the
> integer), so that you can give the compiler insight into the fact that you
> are potentially doing multiple subscript operations in the same chain, and
> the compiler can optimize that chain. For example, given
> list[list[1]], the compiler can figure out ways to make this as cache
> friendly as possible.
> (This feature would probably work the same way if you provided an operator
> that converts the type to int, but again, this feature would make the
> intent clearer.)
>
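For reference, the third option in the list above (a function whose return type depends on a runtime index) can be approximated in today's C++ by returning a `std::variant`. The following is only a minimal sketch under that assumption, not the proposed feature: `get_as_variant`, `runtime_get_impl` and `runtime_get` are hypothetical names that do not come from the thread or from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <variant>

// Hypothetical helper: wrap element I of the tuple in a variant.
template <std::size_t I, typename... Ts>
std::variant<Ts...> get_as_variant(const std::tuple<Ts...>& items) {
    return std::variant<Ts...>{std::in_place_index<I>, std::get<I>(items)};
}

// Builds one getter per element and selects among them with a runtime index.
template <typename... Ts, std::size_t... Is>
std::variant<Ts...> runtime_get_impl(const std::tuple<Ts...>& items,
                                     std::size_t index,
                                     std::index_sequence<Is...>) {
    using getter_t = std::variant<Ts...> (*)(const std::tuple<Ts...>&);
    // Table of function pointers, one per tuple element.
    static constexpr getter_t table[] = {&get_as_variant<Is, Ts...>...};
    return table[index](items);
}

// Hypothetical entry point: runtime subscript over a heterogeneous tuple.
template <typename... Ts>
std::variant<Ts...> runtime_get(const std::tuple<Ts...>& items, std::size_t index) {
    static_assert(sizeof...(Ts) > 0, "cannot index an empty tuple");
    assert(index < sizeof...(Ts));
    return runtime_get_impl(items, index, std::index_sequence_for<Ts...>{});
}
```

With `auto items = std::make_tuple(1, 2.5, 3.0f);`, the call `runtime_get(items, 1)` yields a variant holding the `double`. The caller still needs `std::visit` or `std::get` to use the result, which is exactly the dispatch cost the thread is arguing about.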
>> On Fri, 3 Apr 2026, 12:43 am Breno Guimarães, <brenorg_at_[hidden]> wrote:
>>
>>> Hi Muneem,
>>>
>>> Can you post what the code would look like with the feature you are
>>> proposing?
>>>
>>> You don't need to try to teach everyone about how compilers work. Some
>>> of the things you say are just incorrect, so this will drag on.
>>>
>>> You showed the benchmark showing different implementations have
>>> different times. Great.
>>>
>>> What would the new, improved version look like?
>>> It doesn't need to compile.
>>>
>>> Thanks!
>>>
>>> On Thu, 2 Apr 2026 at 16:32, Muneem via Std-Proposals <
>>> std-proposals_at_[hidden]> wrote:
>>>
>>>> > What? Did you confuse it with Link Time Optimization? Instantiation happens
>>>> > in the translation unit, not at link time, as that would be too late.
>>>> > We have extern templates, but then the compiler in one TU becomes blind to
>>>> > that code because it is instantiated in another TU. LTO can help there, but
>>>> > then we have exactly the same case as if we used normal functions.
>>>> It's different for templates because the linker has to plug in some
>>>> type, which again gives it more flexibility since it knows the INTENT IS TO
>>>> PLUG IN types. Current std::variant implementations do rely on templates,
>>>> but that is not good enough for subscripting, as my benchmarking
>>>> consistently shows. How am I supposed to subscript heterogeneous lists when
>>>> branching is not fast enough? Why is assembly code outperformed by C++ code
>>>> (for large code bases)? Because C++ has constructs that give the compiler
>>>> intent and context, which helps the compiler do automated optimizations.
>>>> No current tool can be used for this until there is a construct that allows
>>>> runtime indexing of heterogeneous half-open sets.
>>>>
>>>> One last detail:
>>>> Tools do exist and I tried them, but they don't convey intent. I can
>>>> make my own tools, but they won't convey intent either, and hence won't be
>>>> fast enough. Why do I use the inline keyword when the compiler should
>>>> in theory inline any function? Why do I use the constexpr keyword if the
>>>> compiler can evaluate anything it can at compile time, as long as the
>>>> observable behaviour does not change? It's a very simple point that I am
>>>> trying to build on.
>>>>
>>>>
>>>>> On Thu, Apr 2, 2026 at 1:11 PM Marcin Jaczewski <
>>>>> marcinjaczewski86_at_[hidden]> wrote:
>>>>>
>>>>>> On Thu, 2 Apr 2026 at 09:05, Muneem via Std-Proposals
>>>>>> <std-proposals_at_[hidden]> wrote:
>>>>>> >
>>>>>> > hi!
>>>>>> > Your point is partly correct, but this issue is quite prevalent.
>>>>>> > Below are my benchmarks from multiple sources.
>>>>>> > This is the updated code:
>>>>>> > #include <variant>
>>>>>> > #include <iostream>
>>>>>> > #include <chrono>
>>>>>> > #include <cstdint>
>>>>>> > #include <ctime>
>>>>>> > #include <iomanip>
>>>>>> > #include <array>
>>>>>> > std::array<int, 3> array_1 = {1, 2, 3};
>>>>>> >
>>>>>> > struct A { int get() { return array_1[0]; } };
>>>>>> > struct B { int get() { return array_1[1]; } };
>>>>>> > struct C { int get() { return array_1[2]; } };
>>>>>> >
>>>>>> > // Dispatches on the alternative currently held by the static variant;
>>>>>> > // the int argument is ignored.
>>>>>> > struct data_accessed_through_visit {
>>>>>> >     static std::variant<A, B, C> obj;
>>>>>> >
>>>>>> >     inline int operator()(int) {
>>>>>> >         return std::visit([](auto&& arg) {
>>>>>> >             return arg.get();
>>>>>> >         }, obj);
>>>>>> >     }
>>>>>> > };
>>>>>> > std::variant<A, B, C> data_accessed_through_visit::obj = C{};
>>>>>> > int user_index = 0;
>>>>>>
>>>>>> I can stop you right here: you use a static object for the
>>>>>> `std::variant`, which means you are testing a different thing than in the
>>>>>> other cases.
>>>>>> This can affect code generation and what the optimizer can see and guarantee.
>>>>>>
>>>>>> Besides, why use `variant` here? It was not designed for this, but to
>>>>>> store different types in the same memory location.
>>>>>> None of that is used here; it only makes it harder for the compiler to
>>>>>> see what is going on.
>>>>>>
>>>>>> Why not use simple pointers here? And, if you need to do some
>>>>>> calculations, function pointers?
>>>>>>
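The function-pointer alternative suggested above could look roughly like the sketch below. It is an illustration only: `values`, `get_a`, `get_b`, `get_c`, `getters` and `get_by_index` are hypothetical names chosen here, with `values` standing in for `array_1` from the benchmark.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

std::array<int, 3> values = {1, 2, 3};  // stand-in for array_1

// One plain function per "alternative", replacing structs A, B and C.
int get_a() { return values[0]; }
int get_b() { return values[1]; }
int get_c() { return values[2]; }

using getter_fn = int (*)();

// Table of function pointers; the runtime index selects the callee directly.
constexpr std::array<getter_fn, 3> getters = {get_a, get_b, get_c};

int get_by_index(std::size_t index) {
    assert(index < getters.size());
    return getters[index]();  // one indirect call, no variant involved
}
```

Calling `get_by_index(2)` goes through a single indirect call and reads `values[2]`. Whether that is faster than `std::visit` on a given compiler and standard library is something only a measurement can settle.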
>>>>>> >
>>>>>> > struct data_ternary {
>>>>>> >     inline int operator()(int index) {
>>>>>> >         return (index == 0) ? array_1[0]
>>>>>> >              : (index == 1) ? array_1[1]
>>>>>> >              : (index == 2) ? array_1[2]
>>>>>> >              : -1;
>>>>>> >     }
>>>>>> > };
>>>>>> >
>>>>>> > struct data_switched {
>>>>>> >     inline int operator()(int index) {
>>>>>> >         switch (index) {
>>>>>> >             case 0: return array_1[0];
>>>>>> >             case 1: return array_1[1];
>>>>>> >             case 2: return array_1[2];
>>>>>> >             default: return -1;
>>>>>> >         }
>>>>>> >     }
>>>>>> > };
>>>>>> >
>>>>>> > struct data_indexing {
>>>>>> >     inline int operator()(int index) {
>>>>>> >         return array_1[index];
>>>>>> >     }
>>>>>> > };
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > volatile int x = 0;  // volatile sink so the stores are not optimized away
>>>>>> > constexpr std::uint64_t loop_count = 10000;
>>>>>> > static void measure_switch() {
>>>>>> >     data_switched obj;
>>>>>> >     for (std::uint64_t i = 0; i < loop_count; ++i) {
>>>>>> >         x = obj(user_index);
>>>>>> >     }
>>>>>> > }
>>>>>> >
>>>>>> > static void measure_visit() {
>>>>>> >     data_accessed_through_visit obj;
>>>>>> >     for (std::uint64_t i = 0; i < loop_count; ++i) {
>>>>>> >         x = obj(user_index);
>>>>>> >     }
>>>>>> > }
>>>>>>
>>>>>> Why not use:
>>>>>> ```
>>>>>> static void measure_visit() {
>>>>>>     // Visit once, outside the loop, so the dispatch cost is paid a single time.
>>>>>>     std::visit([](auto&& arg) {
>>>>>>         for (std::uint64_t i = 0; i < loop_count; ++i) {
>>>>>>             x = arg.get();
>>>>>>         }
>>>>>>     }, data_accessed_through_visit::obj);
>>>>>> }
>>>>>> ```
>>>>>> This is more similar to what the compiler sees in the other test
>>>>>> cases.
>>>>>> >
>>>>>> > static void measure_ternary() {
>>>>>> >     data_ternary obj;
>>>>>> >     for (std::uint64_t i = 0; i < loop_count; ++i) {
>>>>>> >         x = obj(user_index);
>>>>>> >     }
>>>>>> > }
>>>>>> > static void measure_indexing() {
>>>>>> >     data_indexing obj;
>>>>>> >     for (std::uint64_t i = 0; i < loop_count; ++i) {
>>>>>> >         x = obj(user_index);
>>>>>> >     }
>>>>>> > }
>>>>>> >
>>>>>> > template<typename func_t>
>>>>>> > void call_func(func_t callable_obj, int /*unused*/) {
>>>>>> >     const auto start = std::chrono::steady_clock::now();
>>>>>> >
>>>>>> >     constexpr int how_much_to_loop = 1000;
>>>>>> >     for (int i = 0; i < how_much_to_loop; ++i) {
>>>>>> >         callable_obj();
>>>>>> >     }
>>>>>> >     const auto end = std::chrono::steady_clock::now();
>>>>>> >     auto result =
>>>>>> >         std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count() / how_much_to_loop;
>>>>>> >     // Note: result is divided by how_much_to_loop a second time here, so the
>>>>>> >     // printed figure is the total nanoseconds divided by 1,000,000.
>>>>>> >     std::cout << result / how_much_to_loop << std::endl;
>>>>>> > }
>>>>>> >
>>>>>> > int main() {
>>>>>> >     std::cout << "Enter index (0 for A, 1 for B, 2 for C): ";
>>>>>> >     if (!(std::cin >> user_index)) return 1;
>>>>>> >
>>>>>> >     // Set the variant state
>>>>>> >     if (user_index == 0) data_accessed_through_visit::obj = A{};
>>>>>> >     else if (user_index == 1) data_accessed_through_visit::obj = B{};
>>>>>> >     else if (user_index == 2) data_accessed_through_visit::obj = C{};
>>>>>> >
>>>>>> >     std::cout << "Time (ns) for switch: ";
>>>>>> >     call_func(measure_switch, user_index);
>>>>>> >
>>>>>> >     std::cout << "Time (ns) for visit: ";
>>>>>> >     call_func(measure_visit, user_index);
>>>>>> >
>>>>>> >     std::cout << "Time (ns) for ternary: ";
>>>>>> >     call_func(measure_ternary, user_index);
>>>>>> >
>>>>>> >     std::cout << "Time (ns) for subscript: ";
>>>>>> >     call_func(measure_indexing, user_index);
>>>>>> >
>>>>>> >     return 0;
>>>>>> > }
>>>>>> > The benchmarks consistently show that these syntax constructs do
>>>>>> > matter (the smaller the index range is, the more the compiler can flatten
>>>>>> > it and know how to branch). Notice how the ternary outperforms them all
>>>>>> > even though it is nested. This means that adding new syntax with the sole
>>>>>> > purpose of giving compilers as much information as possible is actually
>>>>>> > useful.
>>>>>>
>>>>>> And what exactly do you propose here? Different structures behave
>>>>>> differently.
>>>>>> How exactly will it look and work? Right now you use a lot of vague
>>>>>> statements that do not show anything.
>>>>>>
>>>>>> > Consider how templates and instantiation give the compiler extra
>>>>>> > insight. Why? Because templates are instantiated at the point of
>>>>>> > instantiation, which can be delayed up to link time. These are the benchmarks:
>>>>>>
>>>>>> What? Did you confuse it with Link Time Optimization? Instantiation happens
>>>>>> in the translation unit, not at link time, as that would be too late.
>>>>>> We have extern templates, but then the compiler in one TU becomes blind to
>>>>>> that code because it is instantiated in another TU. LTO can help there, but
>>>>>> then we have exactly the same case as if we used normal functions.
>>>>>>
>>>>>> > benchmarks for g++:
>>>>>> > Enter index (0 for A, 1 for B, 2 for C): 2
>>>>>> > Time (ns) for switch: 33
>>>>>> > Time (ns) for visit: 278
>>>>>> > Time (ns) for ternary: 19
>>>>>> > Time (ns) for subscript: 34
>>>>>> > PS C:\Users\drnoo\Downloads> .\a.exe
>>>>>> > Enter index (0 for A, 1 for B, 2 for C): 2
>>>>>> > Time (ns) for switch: 33
>>>>>> > Time (ns) for visit: 296
>>>>>> > Time (ns) for ternary: 20
>>>>>> > Time (ns) for subscript: 35
>>>>>> > PS C:\Users\drnoo\Downloads> .\a.exe
>>>>>> > Enter index (0 for A, 1 for B, 2 for C): 2
>>>>>> > Time (ns) for switch: 34
>>>>>> > Time (ns) for visit: 271
>>>>>> > Time (ns) for ternary: 17
>>>>>> > Time (ns) for subscript: 33
>>>>>> > PS C:\Users\drnoo\Downloads> .\a.exe
>>>>>> > Enter index (0 for A, 1 for B, 2 for C): 2
>>>>>> > Time (ns) for switch: 34
>>>>>> > Time (ns) for visit: 281
>>>>>> > Time (ns) for ternary: 19
>>>>>> > Time (ns) for subscript: 32
>>>>>> > PS C:\Users\drnoo\Downloads> .\a.exe
>>>>>> > Enter index (0 for A, 1 for B, 2 for C): 2
>>>>>> > Time (ns) for switch: 34
>>>>>> > Time (ns) for visit: 282
>>>>>> > Time (ns) for ternary: 20
>>>>>> > Time (ns) for subscript: 34
>>>>>> > I really have to go to sleep now (I am having some issues with
>>>>>> > Visual Studio 2026). I hope it is acceptable for me to send the
>>>>>> > benchmarks for that tomorrow.
>>>>>> >
>>>>>> > regards, Muneem
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Apr 2, 2026 at 10:54 AM Thiago Macieira via Std-Proposals
>>>>>> > <std-proposals_at_[hidden]> wrote:
>>>>>> >>
>>>>>> >> On Wednesday, 1 April 2026 21:56:44 Pacific Daylight Time Muneem
>>>>>> >> via Std-Proposals wrote:
>>>>>> >> > /*
>>>>>> >> > Time (ns) for switch: 168100
>>>>>> >> > Time (ns) for visit: 3664100
>>>>>> >> > Time (ns) for ternary: 190900
>>>>>> >> > It keeps on getting worse!
>>>>>> >> > */
>>>>>> >>
>>>>>> >> So far you've maybe shown that one implementation is generating bad code.
>>>>>> >> Have you tried others?
>>>>>> >>
>>>>>> >> You need to prove that this is an inherent and unavoidable problem of the
>>>>>> >> requirements, not that it just happened to be bad for this implementation.
>>>>>> >> Just quickly reading the proposed benchmark code, it would seem there's no
>>>>>> >> such inherent reason and you're making an unfounded and probably incorrect
>>>>>> >> assumption about how things actually work.
>>>>>> >>
>>>>>> >> In fact, I pasted a portion of your code into godbolt just to see what the
>>>>>> >> variant visit code, which you claim to be unnecessarily slow, would look like:
>>>>>> >> https://gcc.godbolt.org/z/WK5bMzcae
>>>>>> >>
>>>>>> >> The first thing to note in the GCC/libstdc++ pane is that it does not use
>>>>>> >> user_index. The compiler thinks it's a constant, meaning this benchmark is
>>>>>> >> faulty. And thus it has constant-propagated this value and is *incredibly*
>>>>>> >> efficient in doing nothing useful. MSVC did likewise.
>>>>>> >>
>>>>>> >> Since MSVC outputs the out-of-line copy of inlined functions, we can see the
>>>>>> >> operator() expansion without the propagation of the user_index constant. And
>>>>>> >> it's no different than what a ternary or switch would look like.
>>>>>> >>
>>>>>> >> In the Clang/libc++ pane, we see indirect function calls. I don't know why
>>>>>> >> libc++ std::variant is implemented this way, but it could be why it is slow
>>>>>> >> for you if you're using this implementation. If you tell Clang to instead use
>>>>>> >> libstdc++ (remove the last argument of the command-line), the indirect
>>>>>> >> function call disappears and we see an unrolled loop of loading the value 10.
>>>>>> >> That would mean Clang is even more efficient at doing nothing.
>>>>>> >>
>>>>>> >> Conclusion: it looks like your assumption that there is a problem to be solved
>>>>>> >> is faulty. There is no problem.
>>>>>> >>
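One way to keep the optimizer from constant-propagating the index, as described above, is to read it through a `volatile` object on every iteration. The following is only a minimal sketch of that idea, not the benchmark from the thread: `bench_values`, `opaque_index`, `sink`, `select_switch`, `select_subscript` and `time_total_ns` are hypothetical names introduced here.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

std::array<int, 3> bench_values = {1, 2, 3};

// Each read of a volatile object is an observable access, so the compiler
// cannot replace the index with a compile-time constant.
volatile int opaque_index = 0;
// Volatile sink so the selected value is actually stored on every iteration.
volatile int sink = 0;

int select_switch(int index) {
    switch (index) {
        case 0: return bench_values[0];
        case 1: return bench_values[1];
        case 2: return bench_values[2];
        default: return -1;
    }
}

int select_subscript(int index) {
    assert(index >= 0 && index < 3);
    return bench_values[static_cast<std::size_t>(index)];
}

// Returns the total elapsed nanoseconds for `iterations` selections.
template <typename Func>
std::int64_t time_total_ns(Func select, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = select(opaque_index);  // index is reloaded on every pass
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Setting `opaque_index` from user input and comparing `time_total_ns(select_switch, n)` with `time_total_ns(select_subscript, n)` gives figures that at least reflect a real runtime index. Inspecting the generated assembly, as done with godbolt above, remains the more reliable check.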
>>>>>> >> --
>>>>>> >> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>>>>>> >> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
>>>>>> >> --
>>>>>> >> Std-Proposals mailing list
>>>>>> >> Std-Proposals_at_[hidden]
>>>>>> >> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2026-04-02 20:23:59