Date: Thu, 07 Aug 2025 13:21:37 -0700
> On Aug 7, 2025, at 12:59 PM, Hans Åberg <haberg_1_at_[hidden]> wrote:
>
>
>> On 7 Aug 2025, at 21:44, Oliver Hunt <oliver_at_[hidden]> wrote:
>>
>>> On Aug 7, 2025, at 11:45 AM, Hans Åberg <haberg_1_at_[hidden]> wrote:
>>>
>>> One reason I chose a[] is to avoid overhead, given the low programming level. This excludes iterators, as division goes top down, and reverse iterators are offset by one, with lots of hidden additions.
>>>
>>> Another is that I could not find a good C++ construct that admits switching between dynamic and static allocations. A quick test on std::span shows that problem, but it could be that it can be made to work. Otherwise, I would prefer to keep size and value together, as in std::span.
>>
>> Adding new pointer-based APIs is just not something we can reasonably accept at this point, and std::span is exactly equivalent to bounded pointers. Can you point to your code (or a godbolt link) showing the difference in codegen?
>>
>> If there’s a significant difference in codegen that implies an optimizer bug rather than an X-is-bad problem.
>
> I tried changing it in my “div” function, and then the “div32” example I posted failed. So I will have to look at it at some other time.
>
> It is not useful with std::span, because one will always go down to 0. So it is mostly to get a nice interface, but the question is whether it generates an overhead.
As a simple example, I made a terrible multi-precision add, and there’s no difference in codegen between the array and the std::span version:
https://godbolt.org/z/7q3Ksqsdc
That's why I’d like to see the division code - it’s possible it’s causing things to go surprisingly wrong in the optimizers if you’re seeing meaningful perf regressions.
—Oliver
Received on 2025-08-07 20:21:49