Date: Fri, 17 May 2024 10:11:41 +0100
On Fri, May 17, 2024 at 8:40 AM Lénárd Szolnoki wrote:
>
> >Tomorrow I'll try make a similar patch for the Microsoft
> >implementation of std::tuple. Then if I have a viable patch file for
> >both compilers, I think it would make sense to write a paper to
> >propose that std::tuple shall be trivially copyable whenever all of
> >its elements are trivially copyable. Same goes for std::pair.
>
> Making these trivially copyable would be an ABI change, as they
> would get passed around in registers if they are small enough.
Patching the Microsoft tuple was just 7 lines:
https://github.com/healytpk/tuple_trivially_copyable/commit/f174f4579fbb7eab54dfdbab1639f706d630cca7
So Lénárd, suppose we have a tuple such as std::tuple<int,int>, and a
function that takes it by value:
    void Func( std::tuple<int,int> ); // pass by value
You reckon that on Linux x86_64, where it was previously invoked as:
    mov rdi, address_of_tuple
    call Func
it might, after the change, be invoked as:
    mov rdi, [address_of_tuple+0]
    mov rsi, [address_of_tuple+4]
    call Func
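To make the distinction concrete, here is a minimal sketch using two stand-in types rather than the real std::tuple. TrivialPair, NonTrivialPair, SumTrivial and SumNonTrivial are hypothetical names invented for this illustration. The only property relied on is that a user-provided copy constructor makes a class non-trivially copyable. As far as I understand the Itanium C++ ABI, such a type is passed by address, while the trivial one is eligible to be passed in registers.

```cpp
#include <cassert>      // used by the checks in the usage note
#include <type_traits>

// Hypothetical stand-in for a trivially copyable tuple<int,int>.
struct TrivialPair {
    int first;
    int second;
};

// Hypothetical stand-in for a tuple that is NOT trivially copyable.
struct NonTrivialPair {
    int first;
    int second;
    NonTrivialPair(int a, int b) : first(a), second(b) {}
    // A user-provided copy constructor makes the type non-trivially copyable.
    NonTrivialPair(const NonTrivialPair& other)
        : first(other.first), second(other.second) {}
};

static_assert(std::is_trivially_copyable<TrivialPair>::value,
              "aggregate of two ints is trivially copyable");
static_assert(!std::is_trivially_copyable<NonTrivialPair>::value,
              "user-provided copy constructor prevents trivial copyability");

// Same source-level signature shape, but the two by-value parameters
// may be passed differently at the ABI level.
int SumTrivial(TrivialPair p) { return p.first + p.second; }
int SumNonTrivial(NonTrivialPair p) { return p.first + p.second; }
```

Both functions compute the same result, e.g. SumTrivial(TrivialPair{1, 2}) and SumNonTrivial(NonTrivialPair(1, 2)) both give 3. Only the calling convention differs, which is exactly why switching a type from one category to the other is an ABI change.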
I tried this out on Godbolt by writing the following function:
    using T = std::tuple<int,int>;
    int Func(T arg)
    {
        return std::get<0u>(arg) + std::get<1u>(arg) +
               std::is_trivially_copyable_v<T>;
    }
Using the old tuple, we get:
    Func(std::tuple<int, int>):
        mov eax, DWORD PTR [rdi+4]
        add eax, DWORD PTR [rdi]
        ret
And using the new tuple, we get:
    Func(std::tuple<int, int>):
        mov rax, rdi
        shr rax, 32
        lea eax, [rax+1+rdi]
        ret
If I replace that mysterious "lea" instruction with "add", we get:
    Func(std::tuple<int, int>):
        mov rax, rdi   ; Copy rdi to rax
        shr rax, 32    ; Shift rax right by 32 bits, isolating the upper 32 bits
        add eax, edi   ; Add the lower 32 bits of rdi
        add eax, 1     ; Add 1 to eax
        ret            ; Return, with the result in eax
So... when the compiler works with a trivially-copyable tuple, it takes:
    Func(std::tuple<int, int>):
and turns it into:
    Func( uint64_t ):
and then it gets invoked as follows:
    Func( ((uint64_t)get<0>(t) << 32u) | (uint32_t)get<1>(t) );
(The cast to uint32_t stops a negative get<1>(t) from sign-extending
into the upper 32 bits.)
I'm surprised it used one register instead of two; I would have used
RDI and RSI instead of doing bitshifting on RDI.
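The single-register behaviour can be imitated in portable code. As far as I understand the System V x86-64 calling convention, a trivially copyable aggregate that fits in 8 bytes is classified as one integer "eightbyte", which would explain why both ints share RDI. The sketch below is an illustration under that assumption. TwoInts, AsRegister and FuncOnRegister are hypothetical names, and TwoInts stands in for the patched tuple.

```cpp
#include <cassert>      // used by the checks in the usage note
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a trivially copyable std::tuple<int,int>.
struct TwoInts {
    int a;  // occupies bytes 0..3 of the object
    int b;  // occupies bytes 4..7 of the object
};
static_assert(sizeof(TwoInts) == 8,
              "sketch assumes 32-bit int and no padding");
static_assert(std::is_trivially_copyable<TwoInts>::value,
              "memcpy below requires a trivially copyable type");

// View the 8 bytes of the object as one 64-bit value, the way a
// single general-purpose register would hold them.
std::uint64_t AsRegister(TwoInts t) {
    std::uint64_t r;
    std::memcpy(&r, &t, sizeof r);
    return r;
}

// Mirrors the generated code: shift to isolate the upper half,
// truncate to get the lower half, add them, then add 1.
int FuncOnRegister(std::uint64_t rdi) {
    std::uint32_t upper = static_cast<std::uint32_t>(rdi >> 32);
    std::uint32_t lower = static_cast<std::uint32_t>(rdi);
    return static_cast<int>(upper + lower + 1u);
}
```

For example, FuncOnRegister(AsRegister(TwoInts{3, 4})) gives 8, matching 3 + 4 + 1 from the original Func. Because only the sum of the two halves is used, the result does not depend on which element ends up in the upper half.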
Anyway, I think it was a big mistake back in 2011 to introduce
std::tuple without mandating that it be trivially copyable if all of
its element types are trivially copyable. To try to rectify this
mistake in 2024 without causing an ABI break, maybe we could add a
new type, std::tuple_trivially_copyable, with all of its constructors
private. std::tuple would then gain the following member functions:
    template<typename... Ts>
    class tuple {
    public:
        constexpr tuple_trivially_copyable &
            get_trivially_copyable(void) & noexcept
            requires (is_trivially_copyable_v<Ts> && ...);
        constexpr tuple_trivially_copyable const &
            get_trivially_copyable(void) const & noexcept
            requires (is_trivially_copyable_v<Ts> && ...);
        constexpr tuple_trivially_copyable &&
            get_trivially_copyable(void) && noexcept
            requires (is_trivially_copyable_v<Ts> && ...);
        constexpr tuple_trivially_copyable const &&
            get_trivially_copyable(void) const && noexcept
            requires (is_trivially_copyable_v<Ts> && ...);
    };
And so then if you want to do a consteval bit_cast of a tuple, you do:
    std::bit_cast< std::array<char unsigned, sizeof(my_tuple)> >(
        my_tuple.get_trivially_copyable() );
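(As an aside, the effect can be roughly approximated in user code today by copying the elements into a trivially copyable mirror type first. The sketch below is only an illustration. TrivialTwoInts, ToTrivial, ToBytes and FromBytes are hypothetical names, and the mirror's byte layout is not guaranteed to match the tuple's own layout. It uses std::memcpy so that it builds as C++17. With C++20, std::bit_cast on the mirror type could take its place and would be usable in constant expressions.)

```cpp
#include <array>
#include <cassert>      // used by the check in the usage note
#include <cstring>
#include <tuple>
#include <type_traits>

// Hypothetical trivially copyable mirror of std::tuple<int,int>;
// the name and layout are assumptions made for this sketch.
struct TrivialTwoInts {
    int first;
    int second;
};
static_assert(std::is_trivially_copyable<TrivialTwoInts>::value,
              "mirror type must be trivially copyable");

using Bytes = std::array<unsigned char, sizeof(TrivialTwoInts)>;

// Copy the elements out of the tuple into the mirror type.
TrivialTwoInts ToTrivial(const std::tuple<int, int>& t) {
    return TrivialTwoInts{std::get<0>(t), std::get<1>(t)};
}

// Obtain the object representation of the mirror as raw bytes.
Bytes ToBytes(const TrivialTwoInts& v) {
    Bytes out{};
    std::memcpy(out.data(), &v, sizeof v);
    return out;
}

// Rebuild a tuple from bytes previously produced by ToBytes.
std::tuple<int, int> FromBytes(const Bytes& in) {
    TrivialTwoInts v;
    std::memcpy(&v, in.data(), sizeof v);
    return std::make_tuple(v.first, v.second);
}
```

A round trip such as FromBytes(ToBytes(ToTrivial(std::make_tuple(7, -9)))) yields a tuple equal to the original. The cost is the extra element-wise copy, which is what the proposed member functions would avoid.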
Or . . . to make things even simpler, add a paragraph to the standard:
    "Where bit_cast is invoked with a source type that is a
    specialization of std::tuple, and if all of the tuple's
    element types are trivially copyable, the tuple is
    implicitly converted to a std::tuple_trivially_copyable
    before the bit_cast is performed".
which would mean that you can do the following in a consteval context:
    std::bit_cast< std::array<char unsigned, sizeof(my_tuple)> >( my_tuple );
How does that sound?
Received on 2024-05-17 09:11:55