Date: Mon, 2 Jun 2025 20:19:50 +0200
On Mon, Jun 2, 2025 at 7:53 PM Tom Honermann via Std-Proposals <
std-proposals_at_[hidden]> wrote:
> On 6/2/25 12:43 PM, Simon Schröder via Std-Proposals wrote:
>
> On Jun 2, 2025, at 11:28 AM, Peter Bindels via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> It has the same code bloat that templates have - arguably more, in case you have multiple assurements, not all of which are used - and it's a language addition, which usually comes with a very high bar of "this is impossible to do right now without this".
>
> I understood the initial proposal slightly differently. Sure, templates or tag dispatching could improve the performance of function calls. At the same time they would increase code size. However, I believe the original intent was to have just a single function that can be called/jumped into at different places. Only very few examples would actually take advantage of this feature. The example given was: The function has an initial check for the parameter range. If the caller can ensure that the check is always false for its argument, we could (in theory) call the function just after the check. The restriction is that there can't be any other code before the check inside the called function (or the compiler must be able to figure out how to rearrange the code without changing the functionality).
>
> Advantages:
> 1. Just code for a single function. No duplicate (compiled) code for different restricted parameters. Less code in the cache! (And thus better performance?)
> 2. Avoid/skip over the check. Might not be really relevant because of branch prediction unless we have a weird pattern. (How often does start get called with a negative number? Almost never -> branch predictor is always right.) But, back to 1: Less code in the cache and thus fewer branch predictions for the branch predictor to store.
>
> Challenges:
> 1. Annotate possible restrictions on parameters in function declarations that could provide a performance improvement. (Using assume(…) on the caller site should tell the compiler everything it needs to know for the call. No extra keyword needed on the side of the caller.)
> 2. The calling translation unit and the called translation unit need to communicate to the linker how to bring the two together. In the case of a single function with multiple entry points I am not sure if there is an existing feature in linkers that can be used. (Wasn't there a C++ talk last year about misunderstood coding guidelines/style guides? Wasn't one of the rules originally meant to say that multiple entry points into a function are bad practice? Is that alright in the case of the compiler proving that the optimization is correct? Are there problems with debugging such a function?)
>
> These concerns were discussed at length in WG21 during the development of P2900
> (Contracts for C++) <https://wg21.link/p2900>. Unfortunately, that
> discussion is not public, so can't be shared. The P2900 design is intended
> to allow implementors to provide options for such performance improving
> transformations. There are ABI considerations in providing multiple entry
> points with regard to parameter passing or what it means to take the
> address of the function.
>
For reference, P3267 discusses how P2900 could be implemented so that
checked contracts get these benefits without breaking callers built by
compilers that do not check the contracts. It likely gives the same
benefits as this proposal. I do still see one difference: this `assure`
proposal would treat both calls as in-contract, with one of them taking a
faster path, while P2900 considers one of them to be out-of-contract.

I will say that this is a very thin advantage, and adding a full language
keyword plus syntax for it is IMO not worth it.
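
To make that difference concrete, here is a rough hand-written sketch of
the "two entry points" idea, using the `start` example from upthread. All
names (`start_unchecked`, `delay_ms`) and the placeholder body are my own
assumptions for illustration; nothing here is taken from the proposal,
P2900 or P3267.

```cpp
#include <stdexcept>

// All names here are hypothetical illustrations, not from the proposal.

// Second "entry point": the body after the range check.
// Only valid when the caller has already established delay_ms >= 0.
int start_unchecked(int delay_ms) noexcept {
    return delay_ms * 2;  // placeholder for the real work
}

// First "entry point": the range check, then the shared body.
// A negative argument is still in-contract here; it has defined
// behaviour (an exception). With a P2900 precondition such as
//   int start(int delay_ms) pre(delay_ms >= 0);
// the same call would instead be a contract violation.
int start(int delay_ms) {
    if (delay_ms < 0) {
        throw std::invalid_argument("start: negative delay");
    }
    return start_unchecked(delay_ms);
}
```

A caller that already knows its argument is non-negative would call
`start_unchecked(5)` directly, and everyone else calls `start(5)`. The
proposal would need the compiler and linker to perform this split
automatically, which is where the ABI questions above come in.
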
> I don't see a need for more in-source syntax to opt-in to the benefits
> this proposal seeks. I instead recommend investigating how to achieve the
> advantages described above as details of a P2900 implementation; e.g., hack
> on the gcc or clang P2900 implementation to see what can be achieved and
> what limitations are imposed.
>
It would indeed be very welcome to have an implementation of P3267's
approach #3, so that we can find out in reality how much it affects bloat,
code size, speed etc.
Received on 2025-06-02 18:20:04