std-proposals

Re: [std-proposals] Return Value Optimisation whenever you need it (guaranteed elision)

From: Frederick Virchanza Gotham <cauldwell.thomas_at_[hidden]>
Date: Sun, 16 Jul 2023 22:22:12 +0100
On Sun, Jul 16, 2023 at 9:22 PM Frederick Virchanza Gotham
<cauldwell.thomas_at_[hidden]> wrote:
>
> I have Linux on my laptop here and I've got cross compilers and
> emulators for aarch64 and arm32, so I'll try it with them too just
> now.


First I tried 32-Bit ARM. Here's what I did at the Linux command line
on my x86_64 laptop:

      $ arm-linux-gnueabihf-g++-13 -o prog prog.cpp -static
      $ qemu-arm-static ./prog
      Args: 5 7.8 monkey
      construct 9
      lock 9
      unlock 9
      destroy 9

So that works fine. Here's what happened when I tried 64-Bit ARM (i.e. aarch64):

      $ aarch64-linux-gnu-g++-13 -o prog ./prog.cpp -static
      $ qemu-aarch64-static ./prog
      Args: 5381368 7.8 ?{???
      qemu: uncaught target signal 11 (Segmentation fault) - core dumped
      Segmentation fault (core dumped)

So the code as written crashes on 64-Bit ARM. I'm reading
the calling convention spec sheet here and it says that when you
return a class by value, the address of the allocated space is passed
in register X8. So I need to change the code a tiny bit. Instead of
having an extra parameter in first place, I need to use inline
assembler to get the value of the X8 register:

    void GiveMeLockedMutex_detail(int const a, double const b, char const *const p)
    {
        Mutex *pm;
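        // AArch64 operand order is "mov dst, src", so this copies X8 into pm.
        // It has to run before anything else in the function can overwrite X8.
        asm volatile ("mov %0, x8" : "=r" (pm));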

        cout << "Args: " << a << " " << b << " " << p << endl;

        ::new(pm) Mutex(9);

        pm->lock();
    }

After making this change, it works. By the way I think I should have
been able to just use an "explicit register variable" as follows:

            register Mutex *pm asm ("x8");

but I couldn't get that to work.

So my original code seems to work on every operating system and CPU
architecture I have tried except for aarch64, and I now have a working
solution for aarch64 also.

Received on 2023-07-16 21:22:21