Date: Sat, 11 Apr 2026 18:37:36 -0600
On Saturday, 11 April 2026 17:02:45 Mountain Daylight Time Thiago Macieira via
Std-Proposals wrote:
> On Saturday, 11 April 2026 16:10:47 Mountain Daylight Time Frederick
> Virchanza
> Gotham via Std-Proposals wrote:
> > __builtin_push_all_argument_registers_onto_stack(); // e.g.
> >
> > rdi, rsi, rcx, rdx, r8, r9
>
> You also need to push the XMM registers used for passing FP arguments.
> However, because functions can take __m256 and __m512, you must actually
> *ban* the use of any XMM register and can only call functions that likewise
> never touch them.
Actually, I saw this implemented recently in glibc, for the GNU TLS2 (or GNU2
TLS, depending on who's writing) implementation. If you're interested in the
implementation, it's _dl_tlsdesc_dynamic_(xsave|xsavec) (warning: LGPL code).
It isn't designed for a tail call. In fact, one major problem of your proposal
is that you need one register to be clobbered because you need to save the
address to which you'll jump in it. You can't do PUSH+RET because that breaks
control-flow enforcement. So there's no way this will work for an arbitrary
ABI. If we stick to the general ABIs, then there are several caller-save
registers that aren't using for parameter passing, and one of them specifically
reserved for function call thunks to use (I think it's r11).
Std-Proposals wrote:
> On Saturday, 11 April 2026 16:10:47 Mountain Daylight Time Frederick
> Virchanza
> Gotham via Std-Proposals wrote:
> > __builtin_push_all_argument_registers_onto_stack(); // e.g.
> >
> > rdi, rsi, rcx, rdx, r8, r9
>
> You also need to push the XMM registers used for passing FP arguments.
> However, because functions can take __m256 and __m512, you must actually
> *ban* the use of any XMM register and can only call functions that likewise
> never touch them.
Actually, I saw this implemented recently in glibc, for the GNU TLS2 (or GNU2
TLS, depending on who's writing) implementation. If you're interested in the
implementation, it's _dl_tlsdesc_dynamic_(xsave|xsavec) (warning: LGPL code).
It isn't designed for a tail call. In fact, one major problem of your proposal
is that you need one register to be clobbered because you need to save the
address to which you'll jump in it. You can't do PUSH+RET because that breaks
control-flow enforcement. So there's no way this will work for an arbitrary
ABI. If we stick to the general ABIs, then there are several caller-save
registers that aren't using for parameter passing, and one of them specifically
reserved for function call thunks to use (I think it's r11).
-- Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org Principal Engineer - Intel Data Center - Platform & Sys. Eng.
Received on 2026-04-12 00:37:48
