<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Aug 25, 2025 at 3:12 AM Adrian Johnston via Std-Proposals &lt;<a href="mailto:std-proposals@lists.isocpp.org">std-proposals@lists.isocpp.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello,<br>
<br>
(If you spend a lot of time looking at generated assembly you might<br>
want to skip this one.)<br>
<br>
As we all know the compiler&#39;s budget for optimizing C++ is an<br>
implementation-defined metric where it knows best what is needed and<br>
we are not supposed to be doing the compiler&#39;s job for it.<br></blockquote><div><br></div><div>That is not how I, or many other people, view things. We know that most of the time the compiler does a very good job, but sometimes it fails.</div><div>I lost the code example, but recently I had an issue where the compiler kept calling .size() on every iteration of a for loop</div><div>(... i &lt; vec.size(); ...) although the vector&#39;s size never changed.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
I will make a couple proposals but let me explain the problems first.<br>
<br>
The keyword const never mattered to the code generator before and it<br>
seems the keyword constexpr doesn&#39;t matter now. The compiler may be<br>
willing to execute constexpr code in a manifestly-const context but it<br>
isn&#39;t going to do anything extra to optimize the generated assembly<br>
otherwise. This is bothersome because everyone learning C++ is going<br>
around acting like the constexpr keyword results in their runtime code<br>
being more thoroughly evaluated at compile time when it doesn&#39;t even<br>
matter at all. </blockquote><div><br></div><div>This is not true. constexpr matters, and so does const. For example, this code will not compile without constexpr.<br><font face="monospace">#include &lt;array&gt;<br><br>constexpr int fun() {<br>   return 10;<br>}<br><br>int main() {<br>    const int sz = fun();<br>    std::array&lt;int, sz&gt; arr;<br>    return arr.size();<br>}</font></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This is more of an education issue, but I wanted to<br>
point out how dangerous the wording around things being made &quot;possible<br>
to evaluate at compile time&quot; was for the uninitiated.<br></blockquote><div><br></div><div><div>The constexpr keyword gives different guarantees on functions than on variables, and I agree that this is confusing for people learning C++. On a variable it requires the initializer to be a constant expression; on a function it only permits compile-time evaluation.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Meanwhile gcc/clang are still perfectly happy to inline and execute<br>
arbitrary non-const code at link time clear across different<br>
translation units. Let me give you an example of code inlining using a<br>
simple insertion sort template that just uses pointers and operator&lt;:<br>
<br>
int example1() {<br>
   int x[3] = { 7, 3 };<br>
   hxinsertion_sort(x+0, x+2);<br>
   printf(&quot;%d %d&quot;, x[0], x[1]);<br>
}<br>
<br>
This results in the following assembly:<br>
<br>
       .string &quot;%d %d&quot;<br>
       sub     rsp, 8<br>
       mov     edx, 7<br>
       mov     esi, 3<br>
       xor     eax, eax<br>
       mov     edi, OFFSET FLAT:.LC0<br>
       call    &quot;printf&quot;<br>
<br>
This means the two numbers were sorted at compile time without any of<br>
the new C++ constant evaluation machinery involved. Meanwhile, if you<br>
write that with std::sort you get this:<br></blockquote><div> </div><div>You can easily force sorting at compile time:</div><div><div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Consolas,&quot;Liberation Mono&quot;,Courier,monospace,&quot;Droid Sans Mono&quot;,&quot;monospace&quot;,monospace;font-size:14px;line-height:19px;white-space:pre"><div><span style="color:rgb(204,102,102)">[[gnu</span><span style="color:rgb(220,220,220)">::</span><span style="color:rgb(204,102,102)">noinline]]</span></div><div><span style="color:rgb(86,156,214)">void</span> example2<span style="color:rgb(220,220,220)">()</span> <span style="color:rgb(220,220,220)">{</span></div><div>   <span style="color:rgb(86,156,214)">constexpr</span> <span style="color:rgb(86,156,214)">auto</span> x <span style="color:rgb(220,220,220)">=[]</span> <span style="color:rgb(220,220,220)">{</span></div><div>      std::array arr<span style="color:rgb(220,220,220)">{</span> <span style="color:rgb(181,206,168)">7</span><span style="color:rgb(220,220,220)">,</span> <span style="color:rgb(181,206,168)">3</span><span style="color:rgb(220,220,220)">,</span> <span style="color:rgb(181,206,168)">0</span><span style="color:rgb(220,220,220)">};</span></div><div>      std::ranges::sort<span style="color:rgb(220,220,220)">(</span>arr<span style="color:rgb(220,220,220)">.</span>data<span style="color:rgb(220,220,220)">(),</span> arr<span style="color:rgb(220,220,220)">.</span>data<span style="color:rgb(220,220,220)">()+</span><span style="color:rgb(181,206,168)">2</span><span style="color:rgb(220,220,220)">);</span></div><div>      <span style="color:rgb(86,156,214)">return</span> arr<span style="color:rgb(220,220,220)">;</span></div><div>   <span style="color:rgb(220,220,220)">}();</span></div><div>   printf<span style="color:rgb(220,220,220)">(</span><span style="color:rgb(206,145,120)">&quot;%d %d\n&quot;</span><span style="color:rgb(220,220,220)">,</span> x<span 
style="color:rgb(220,220,220)">[</span><span style="color:rgb(181,206,168)">0</span><span style="color:rgb(220,220,220)">],</span> x<span style="color:rgb(220,220,220)">[</span><span style="color:rgb(181,206,168)">1</span><span style="color:rgb(220,220,220)">]);</span></div><div><span style="color:rgb(220,220,220)">}</span></div></div></div><div><br></div><div><br><div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Consolas,&quot;Liberation Mono&quot;,Courier,monospace,&quot;Droid Sans Mono&quot;,&quot;monospace&quot;,monospace;font-size:14px;line-height:19px;white-space:pre"><div><span style="color:rgb(86,156,214)">example2</span><span style="color:rgb(220,220,220)">()</span>:</div><div> <span style="color:rgb(86,156,214)">lea</span>    <span style="color:rgb(72,100,170)">rdi</span>,<span style="color:rgb(220,220,220)">[</span><span style="color:rgb(72,100,170)">rip</span>+<span style="color:rgb(91,180,152)">0xe69</span><span style="color:rgb(220,220,220)">]</span>        <span style="color:rgb(96,139,78)"># 2010 &lt;_IO_stdin_used+0x10&gt;</span></div><div> <span style="color:rgb(86,156,214)">xor</span>    <span style="color:rgb(72,100,170)">esi</span>,<span style="color:rgb(72,100,170)">esi</span></div><div> <span style="color:rgb(86,156,214)">mov</span>    <span style="color:rgb(72,100,170)">edx</span>,<span style="color:rgb(91,180,152)">0x3</span></div><div> <span style="color:rgb(86,156,214)">xor</span>    <span style="color:rgb(72,100,170)">eax</span>,<span style="color:rgb(72,100,170)">eax</span></div><div> <span style="color:rgb(86,156,214)">jmp</span>    <span style="color:rgb(181,206,168)">1030</span> <span style="color:rgb(220,220,220)">&lt;</span><span style="color:rgb(61,201,176)">printf@plt</span><span style="color:rgb(220,220,220)">&gt;</span></div></div></div><div><br></div><div><div style="color:rgb(212,212,212);background-color:rgb(30,30,30);font-family:Consolas,&quot;Liberation Mono&quot;,Courier,monospace,&quot;Droid 
Sans Mono&quot;,&quot;monospace&quot;,monospace;font-size:14px;line-height:19px;white-space:pre"><div><span style="color:rgb(86,156,214)">example2</span><span style="color:rgb(220,220,220)">()</span>:</div><div>        <span style="color:rgb(86,156,214)">mov</span>     <span style="color:rgb(72,100,170)">edx</span>, <span style="color:rgb(181,206,168)">3</span></div><div>        <span style="color:rgb(86,156,214)">xor</span>     <span style="color:rgb(72,100,170)">esi</span>, <span style="color:rgb(72,100,170)">esi</span></div><div>        <span style="color:rgb(86,156,214)">mov</span>     <span style="color:rgb(72,100,170)">edi</span>, <span style="color:rgb(61,201,176)">OFFSET</span> <span style="color:rgb(61,201,176)">FLAT</span>:<span style="color:rgb(61,201,176)">.LC0</span></div><div>        <span style="color:rgb(86,156,214)">xor</span>     <span style="color:rgb(72,100,170)">eax</span>, <span style="color:rgb(72,100,170)">eax</span></div><div>        <span style="color:rgb(86,156,214)">jmp</span>     <span style="color:rgb(61,201,176)">printf</span></div></div></div><div><br></div><div><a href="https://godbolt.org/z/43WjbqvM3">https://godbolt.org/z/43WjbqvM3</a></div><div>// I shortened your array from 3 to 2 elements, but if you want you can create a 3-element std::array and then sort just the first 2 elements; constexpr works fine.</div><div>It is interesting that gcc manages to do the optimization even for your example with std::sort, while clang fails.</div><div><br></div><div>The advantage of constexpr here is the following: when I see constexpr arr = something,</div><div>I know arr is computed at compile time. In really big projects, where I do not have time to look at the asm for the entire project, this is a huge productivity benefit.</div><div><br></div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">So, my first proposal is to have a template library that can operate<br>
using pointers and arrays without extra templated abstraction layers.<br>
Then the compiler does a better job, your compiler errors are nice and<br>
clean and the debugger is a relative joy to use. And when it comes to<br>
safety, the clang sanitizers can be used to make raw pointers just as<br>
safe as iterators these days.<br></blockquote><div><br></div><div>I think you are missing an important point about iterators. Can your pointer-based approach sort a std::deque? std::sort can. </div><div>If hxsort cannot sort a std::deque, you are basically asking people to maintain multiple implementations of std::sort. </div><div>Also, insertion sort has O(N^2) worst-case complexity, which the standard does not allow for std::sort, and more importantly for your example its implementation is </div><div>much simpler than that of std::sort. So it may not be the iterators that cause the compiler to fail to optimize, but the difference in complexity</div><div>of the sort implementations.</div><div><br></div><div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> I have seen a professionally written C++ codebase spend 3% of<br>
its time inside std::vector::operator[] with all optimizations turned<br>
on. (This is why the standard library gets banned from real-time<br>
embedded projects.) If we required support for -O9 and told everyone<br>
they may have to let the compiler run overnight then at least all this<br>
talk about what is possible with compile time evaluation would be less<br>
deceptive.</blockquote><div><br></div><div>While this idea is tempting, I think it is not a matter for the C++ standard; it is more something that compiler people could do. </div><div>But I presume you are not the first person to come up with this idea. My <b>guess </b>(we would need a definitive answer from compiler people) as to why there are no flags like -O9 is:</div><div><ol><li>you can control this behavior with other flags</li><li>the gains would be minimal, e.g. the problem you described with sort is caused by compiler optimization issues, not by the compiler lacking the time budget to get it right</li></ol><div><br></div></div><div><br></div><div>  </div></div></div>
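<div>On the earlier point that constexpr means different things on functions and on variables, here is a minimal sketch of the distinction. The names square, at_compile_time and maybe_at_run_time are hypothetical and chosen only for illustration; they do not come from the original code or the godbolt link.</div>
<pre>
```cpp
// Hypothetical helper for illustration only; not taken from the thread.
constexpr int square(int v) { return v * v; }

// constexpr on a variable: the initializer must be a constant expression,
// so the compiler has to evaluate square(4) during compilation.
constexpr int at_compile_time = square(4);
static_assert(at_compile_time == 16, "checked by the compiler");

// constexpr on a function only permits constant evaluation. With a run-time
// argument this is an ordinary call that the optimizer may or may not fold.
int maybe_at_run_time(int n) { return square(n); }
```
</pre>
<div>Whether the second call is folded depends on the optimizer and the flags in use, which is why a constexpr variable is the more reliable way to request compile-time evaluation.</div>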

