Date: Tue, 26 Aug 2025 09:41:44 -0700
I’d like to move on to less unpopular proposals after this…
“No, you're misreading the assembly. The only "iterators" here are
pointers.”
I’m not claiming iterators are directly visible in the assembly. I am
claiming that they add inlining overhead.
I am also not saying that the compiler was insufficiently aggressive at
inlining. What I observed was that the cost of inlining was deducted from
the budget for the constant folding pass, so the constant folding pass was
being cut short. And this is in an era where constant folding is being
encouraged.
It seems the compiler is just willing to do more constant folding when you
tell it that the code base is more modern. I can see myself making that
same call too.
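
For anyone following along, here is a minimal sketch of the kind of
pointer-only insertion sort the quoted message refers to. The actual
hxinsertion_sort is not shown anywhere in this thread, so the name
insertion_sort_sketch and its body are assumptions made for illustration;
the real code may differ. It uses only raw pointers and operator<, with no
iterator or comparator wrapper types between the call site and the loop.

```cpp
#include <cstdio>

// Hypothetical stand-in for hxinsertion_sort (the real one is not shown in
// the thread). Sorts the range [first, last) in place using only pointers
// and operator<.
template <typename T>
void insertion_sort_sketch(T* first, T* last) {
    if (first == last) {
        return;  // empty range, nothing to do
    }
    for (T* i = first + 1; i < last; ++i) {
        T value = *i;
        T* j = i;
        // Shift larger elements one slot right until value's position is found.
        while (j > first && value < *(j - 1)) {
            *j = *(j - 1);
            --j;
        }
        *j = value;
    }
}

// Mirrors example1 from the quoted message, with the assumed sort swapped in.
int example1_sketch() {
    int x[3] = { 7, 3 };
    insertion_sort_sketch(x + 0, x + 2);
    std::printf("%d %d", x[0], x[1]);
    return 0;
}
```

Whether a given compiler folds the call in example1_sketch down to
constants depends on the compiler version and optimization flags, so
treat this as something to paste into a compiler explorer rather than a
guaranteed result.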
On Tuesday, August 26, 2025, Jonathan Wakely <cxx_at_[hidden]> wrote:
>
>
> On Mon, 25 Aug 2025 at 02:12, Adrian Johnston via Std-Proposals <
> std-proposals_at_[hidden]> wrote:
>
>> Hello,
>>
>> (If you spend a lot of time looking at generated assembly you might
>> want to skip this one.)
>>
>> As we all know the compiler's budget for optimizing C++ is an
>> implementation-defined metric where it knows best what is needed and
>> we are not supposed to be doing the compiler's job for it.
>> Unfortunately, this view has never really worked well in practice.
>> Recently I spent some time looking at the generated assembly from gcc
>> and clang and it doesn't look like the situation has improved in 30
>> years.
>>
>
> This is nonsense.
>
> Optimizations like autovectorization and conditional devirtualization do
> lots that wasn't possible 30 years ago. A problem is that many programs are
> much bigger than they were 30 years ago.
>
>
>
>> I will make a couple proposals but let me explain the problems first.
>>
>> The keyword const never mattered to the code generator before and it
>> seems the keyword constexpr doesn't matter now. The compiler may be
>> willing to execute constexpr code in a manifestly-const context but it
>> isn't going to do anything extra to optimize the generated assembly
>> otherwise. This is bothersome because everyone learning C++ is going
>> around acting like the constexpr keyword results in their runtime code
>> being more thoroughly evaluated at compile time when it doesn't even
>> matter at all. This is more of an education issue, but I wanted to
>> point out how dangerous the wording around things being made "possible
>> to evaluate at compile time" was for the uninitiated.
>>
>> Meanwhile gcc/clang are still perfectly happy to inline and execute
>> arbitrary non-const code at link time clear across different
>> translation units. Let me give you an example of code inlining using a
>> simple insertion sort template that just uses pointers and operator<:
>>
>> int example1() {
>>     int x[3] = { 7, 3 };
>>     hxinsertion_sort(x+0, x+2);
>>     printf("%d %d", x[0], x[1]);
>> }
>>
>> This results in the following assembly:
>>
>> .string "%d %d"
>> sub rsp, 8
>> mov edx, 7
>> mov esi, 3
>> xor eax, eax
>> mov edi, OFFSET FLAT:.LC0
>> call "printf"
>>
>> This means the two numbers were sorted at compile time without any of
>> the new C++ constant evaluation machinery involved. Meanwhile, if you
>> write that with std::sort you get this:
>>
>> .string "%d %d"
>> movabs rax, 12884901895
>> sub rsp, 24
>> mov rdi, rsp
>> lea rsi, [rsp+8]
>> mov QWORD PTR [rsp], rax
>> mov DWORD PTR [rsp+8], 0
>> call "void std::__insertion_sort<int*,
>> __gnu_cxx::__ops::_Iter_less_iter>(int*, int*,
>> __gnu_cxx::__ops::_Iter_less_iter) [clone .isra.0]"
>> mov edx, DWORD PTR [rsp+4]
>> mov esi, DWORD PTR [rsp]
>> xor eax, eax
>> mov edi, OFFSET FLAT:.LC1
>> call "printf"
>>
>> It appears that the real problem here is actually that the standard
>> template library uses iterators.
>
>
> No, you're misreading the assembly. The only "iterators" here are
> pointers. The word "iter" in a symbol name doesn't mean there's an iterator
> class involved.
>
>
>
>> Sacrilege you say? Well, we know the
>> compiler has an optimization budget and we know that inlining does
>> require some work by the compiler.... I would argue that it is
>> impressive that the compiler was able to call insertion sort directly
>> without invoking a partitioning scheme first.
>>
>> So, my first proposal is to have a template library that can operate
>> using pointers and arrays without extra templated abstraction layers.
>>
>
> That already exists.
>
>
>>
>>
Received on 2025-08-26 16:41:49