Date: Mon, 25 Aug 2025 13:45:32 +0200
On Mon, Aug 25, 2025 at 3:12 AM Adrian Johnston via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> Hello,
>
> (If you spend a lot of time looking at generated assembly you might
> want to skip this one.)
>
> As we all know the compiler's budget for optimizing C++ is an
> implementation-defined metric where it knows best what is needed and
> we are not supposed to be doing the compiler's job for it.
>
That is not how I or many other people view things. We know that most of the
time the compiler does a very good job, but sometimes it fails.
I lost the code example, but recently I had an issue where the compiler kept
calling .size() in a for loop
(... i < vec.size(); ...) although the vector's size never changed.
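
Since the original example is lost, the following is only a hypothetical reconstruction of the pattern, not the code in question. It assumes the reload was caused by stores that might alias the vector's internals (stores through char may alias any object); that cause is a guess. The names fill_reloading and fill_hoisted are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reconstruction: the loop condition calls vec.size() on every
// iteration. The stores go through char, which may alias any object, so the
// compiler may be unable to prove that the vector's internals are unchanged.
void fill_reloading(std::vector<char>& vec, char value) {
    for (std::size_t i = 0; i < vec.size(); ++i) {
        vec[i] = value;
    }
}

// Manual workaround: read the size once, before the loop starts.
void fill_hoisted(std::vector<char>& vec, char value) {
    const std::size_t n = vec.size();
    for (std::size_t i = 0; i < n; ++i) {
        vec[i] = value;
    }
}
```

Both functions produce the same result; whether the first one really reloads the size depends on the compiler and flags, so it is worth checking the generated assembly before hoisting by hand.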
>
> I will make a couple proposals but let me explain the problems first.
>
> The keyword const never mattered to the code generator before and it
> seems the keyword constexpr doesn't matter now. The compiler may be
> willing to execute constexpr code in a manifestly-const context but it
> isn't going to do anything extra to optimize the generated assembly
> otherwise. This is bothersome because everyone learning C++ is going
> around acting like the constexpr keyword results in their runtime code
> being more thoroughly evaluated at compile time when it doesn't even
> matter at all.
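This is not true. constexpr matters, and so does const. For example, this code
will not compile if you remove constexpr from fun():

#include <array>

constexpr int fun() {
    return 10;
}

int main() {
    // sz can be a template argument only because its initializer
    // is a constant expression.
    const int sz = fun();
    std::array<int, sz> arr;
    return arr.size();
}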
> This is more of an education issue, but I wanted to
> point out how dangerous the wording around things being made "possible
> to evaluate at compile time" was for the uninitiated.
>
The constexpr keyword gives different guarantees on functions than on
variables, and I agree that this is confusing for people learning C++.
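
A minimal sketch of that difference, using made-up names (square, compile_time, run_time) that do not come from this thread:

```cpp
#include <cassert>

// A constexpr function *may* be evaluated at compile time, but it is also an
// ordinary function that can be called with values known only at run time.
constexpr int square(int v) { return v * v; }

// A constexpr variable *must* be initialized by a constant expression, so
// this call has to be evaluated by the compiler.
constexpr int compile_time = square(7);
static_assert(compile_time == 49, "checked during compilation");

// Here the argument is a runtime value; the same function is called (or
// inlined) like any other function, with no compile-time guarantee.
int run_time(int v) { return square(v); }
```

So constexpr on a function only permits constant evaluation, while constexpr on a variable requires it.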
>
> Meanwhile gcc/clang are still perfectly happy to inline and execute
> arbitrary non-const code at link time clear across different
> translation units. Let me give you an example of code inlining using a
> simple insertion sort template that just uses pointers and operator<:
>
> int example1() {
> int x[3] = { 7, 3 };
> hxinsertion_sort(x+0, x+2);
> printf("%d %d", x[0], x[1]);
> }
>
> This results in the following assembly:
>
> .string "%d %d"
> sub rsp, 8
> mov edx, 7
> mov esi, 3
> xor eax, eax
> mov edi, OFFSET FLAT:.LC0
> call "printf"
>
> This means the two numbers were sorted at compile time without any of
> the new C++ constant evaluation machinery involved. Meanwhile, if you
> write that with std::sort you get this:
>
You can easily force the sort to happen at compile time:

#include <algorithm>
#include <array>
#include <cstdio>

[[gnu::noinline]]
void example2() {
    constexpr auto x = [] {
        std::array arr{ 7, 3, 0 };
        std::ranges::sort(arr.data(), arr.data() + 2);
        return arr;
    }();
    printf("%d %d\n", x[0], x[1]);
}

example2():
    lea rdi,[rip+0xe69] # 2010 <_IO_stdin_used+0x10>
    xor esi,esi
    mov edx,0x3
    xor eax,eax
    jmp 1030 <printf@plt>

example2():
    mov edx, 3
    xor esi, esi
    mov edi, OFFSET FLAT:.LC0
    xor eax, eax
    jmp printf
https://godbolt.org/z/43WjbqvM3
// I shortened your array from 3 to 2 elements, but if you want you can
create a 3-element std::array and then sort just the first 2 elements;
constexpr works fine.
It is interesting that gcc manages to do the optimization even for your
example with std::sort, while clang fails.
The advantage of constexpr here is this: when I see constexpr arr =
something, I know arr is computed at compile time. In real, large projects,
where I do not have time to look at the asm for the entire project, this is a
huge productivity benefit.
> So, my first proposal is to have a template library that can operate
> using pointers and arrays without extra templated abstraction layers.
> Then the compiler does a better job, your compiler errors are nice and
> clean and the debugger is a relative joy to use. And when it comes to
> safety, the clang sanitizers can be used to make raw pointers just as
> safe as iterators these days.
>
I think you are missing an important point about iterators. Can your
pointer-based approach sort a std::deque? std::sort can.
If hxsort cannot sort a std::deque, you are basically asking people to
maintain multiple implementations of std::sort.
Also, insertion sort has O(N^2) worst-case complexity, which the standard does
not allow for std::sort (it requires O(N log N) comparisons). More
importantly for your example, its implementation is much simpler than
std::sort. So it may not be the iterators that cause the compiler to fail to
optimize, but the difference in complexity between the sort implementations.
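
To make the std::deque point concrete, here is a rough sketch. pointer_insertion_sort is a hypothetical stand-in for hxinsertion_sort, whose real implementation is not shown in this thread; sort_vector and sort_deque are also made-up helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical pointer-only insertion sort using just pointers and operator<.
template <typename T>
void pointer_insertion_sort(T* first, T* last) {
    if (first == last) return;  // also avoids pointer arithmetic on an empty range
    for (T* i = first + 1; i < last; ++i) {
        T key = *i;
        T* j = i;
        // Shift larger elements one slot to the right, then drop key in place.
        while (j > first && key < *(j - 1)) {
            *j = *(j - 1);
            --j;
        }
        *j = key;
    }
}

// Works: std::vector stores its elements contiguously, so data() gives a
// valid pointer range.
std::vector<int> sort_vector(std::vector<int> v) {
    pointer_insertion_sort(v.data(), v.data() + v.size());
    return v;
}

// std::deque is not contiguous and has no data() member, so there is no
// pointer range to pass; the iterator-based std::sort handles it directly.
std::deque<int> sort_deque(std::deque<int> d) {
    std::sort(d.begin(), d.end());
    return d;
}
```

A pointer-only library would need a second code path (or a copy into contiguous storage) for containers like std::deque.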
> I have seen a professionally written C++ codebase spend 3% of
> its time inside std::vector::operator[] with all optimizations turned
> on. (This is why the standard library gets banned from real-time
> embedded projects.) If we required support for -O9 and told everyone
> they may have to let the compiler run overnight then at least all this
> talk about what is possible with compile time evaluation would be less
> deceptive.
While this idea is tempting, I think it is not something for the C++ standard;
it is more something that compiler people could do.
But I presume you are not the first person to come up with this idea. My
*guess* (we would need a definitive answer from compiler people) as to why
there are no flags like O9 is:
1. you can control this behavior with other flags
2. the gains would be minimal, e.g. the problem you described with sort is
caused by compiler optimization issues, not by the compiler lacking the time
budget to get it right
Received on 2025-08-25 11:45:46