std-proposals

Re: [std-proposals] The real problems with optimizing C++ are not getting better

From: Bo Persson <bo_at_[hidden]>
Date: Mon, 25 Aug 2025 05:52:30 +0200
On 2025-08-25 at 03:12, Adrian Johnston via Std-Proposals wrote:
> Hello,
>
> (If you spend a lot of time looking at generated assembly you might
> want to skip this one.)
>
> As we all know the compiler's budget for optimizing C++ is an
> implementation-defined metric where it knows best what is needed and
> we are not supposed to be doing the compiler's job for it.
> Unfortunately, this view has never really worked well in practice.
> Recently I spent some time looking at the generated assembly from gcc
> and clang and it doesn't look like the situation has improved in 30
> years.
>
> I will make a couple of proposals but let me explain the problems first.
>
> The keyword const never mattered to the code generator before and it
> seems the keyword constexpr doesn't matter now. The compiler may be
> willing to execute constexpr code in a manifestly-const context but it
> isn't going to do anything extra to optimize the generated assembly
> otherwise. This is bothersome because everyone learning C++ is going
> around acting like the constexpr keyword results in their runtime code
> being more thoroughly evaluated at compile time when it doesn't even
> matter at all. This is more of an education issue, but I wanted to
> point out how dangerous the wording around things being made "possible
> to evaluate at compile time" was for the uninitiated.
>
> Meanwhile gcc/clang are still perfectly happy to inline and execute
> arbitrary non-const code at link time clear across different
> translation units. Let me give you an example of code inlining using a
> simple insertion sort template that just uses pointers and operator<:
>
> int example1() {
>     int x[3] = { 7, 3 };
>     hxinsertion_sort(x+0, x+2);
>     printf("%d %d", x[0], x[1]);
> }
>
> This results in the following assembly:
>
> .string "%d %d"
> sub rsp, 8
> mov edx, 7
> mov esi, 3
> xor eax, eax
> mov edi, OFFSET FLAT:.LC0
> call "printf"
>
> This means the two numbers were sorted at compile time without any of
> the new C++ constant evaluation machinery involved. Meanwhile, if you
> write that with std::sort you get this:
>
> .string "%d %d"
> movabs rax, 12884901895
> sub rsp, 24
> mov rdi, rsp
> lea rsi, [rsp+8]
> mov QWORD PTR [rsp], rax
> mov DWORD PTR [rsp+8], 0
> call "void std::__insertion_sort<int*,
> __gnu_cxx::__ops::_Iter_less_iter>(int*, int*,
> __gnu_cxx::__ops::_Iter_less_iter) [clone .isra.0]"
> mov edx, DWORD PTR [rsp+4]
> mov esi, DWORD PTR [rsp]
> xor eax, eax
> mov edi, OFFSET FLAT:.LC1
> call "printf"
>
> It appears that the real problem here is actually that the standard
> template library uses iterators.
Or std::sort might not be optimal for sorting 2 numbers. I would look
for std::min and std::max before designing a new library.
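
As a rough illustration of that suggestion, here is a minimal sketch that orders two values with std::minmax instead of calling a sort. The names ordered_pair and example_minmax are hypothetical and do not come from the thread, and no claim is made here about what assembly any particular compiler emits for it.

```cpp
#include <algorithm>  // std::minmax
#include <cassert>    // assert
#include <cstdio>     // std::printf
#include <utility>    // std::pair

// Hypothetical helper (not from the thread): returns {smaller, larger}.
// std::minmax(a, b) yields a pair of const references; the return
// statement copies them into a pair of values before a and b go away.
constexpr std::pair<int, int> ordered_pair(int a, int b) {
    return std::minmax(a, b);
}

// The helper is usable in a constant expression, checked at compile time.
static_assert(ordered_pair(7, 3).first == 3, "smaller value comes first");
static_assert(ordered_pair(7, 3).second == 7, "larger value comes second");

// Same shape as the quoted example1(), but with no sort call involved.
int example_minmax() {
    int x[2] = { 7, 3 };
    const std::pair<int, int> p = ordered_pair(x[0], x[1]);
    assert(p.first <= p.second);
    std::printf("%d %d", p.first, p.second);
    return 0;
}
```

Calling example_minmax() prints the two values in ascending order. The static_assert lines only show that the helper can be evaluated at compile time when the context requires it; they say nothing about how the runtime call is optimized.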

Received on 2025-08-25 03:52:36