Date: Thu, 17 Oct 2024 21:20:25 +0300
P2889 is prior art in this space.
- Elias
On 10/17/24 5:58 PM, Federico Kircheis via Std-Discussion wrote:
> Hello,
>
> I would like to have, in c++, a way to "register" objects from
> different translation units in a global container (an array).
>
> This is already possible in gcc and msvc with tools outside of the
> language, with multiple drawbacks.
>
> To make it clearer what I am talking about, here is one example
> with gcc
>
> ----
> #include <cstdio>
> #include <cstdint>
> #include <span>
>
> using test_signature = void();
>
> #define CONCAT_IMPL(x, y) x##y
> #define CONCAT(x, y) CONCAT_IMPL(x, y)
> #define REGISTER_FUN(name) \
> void name();\
> [[gnu::used]] constexpr auto CONCAT(helper, __LINE__) \
> [[gnu::section(".tmptests")]] = &name; \
> void name()
>
> REGISTER_FUN(test1){std::puts("test1");}
> REGISTER_FUN(test2){std::puts("test2");}
>
> std::span<test_signature*> get_tests() noexcept {
> extern test_signature* tests_begin[];
> extern test_signature* tests_end[];
> const auto tests_size = ((uintptr_t)(tests_end) -
> (uintptr_t)(tests_begin))/sizeof(test_signature*);
> test_signature** begin = tests_begin;
> asm("":"+r"(begin));
> return std::span<test_signature*>(begin, begin + tests_size);
> }
>
> int main(){
> auto funcs = get_tests();
> for(const auto& v : funcs){
> v();
> }
> }
> ----
>
> compile and execute with
>
> gcc --std=c++20 -Wl,-Tlinkerscript.ld main.cpp
>
> where linkerscript.ld is
>
> ----
> SECTIONS
> {
> tests (READONLY) : {
> PROVIDE(tests_begin = .);
> KEEP(*(.tmptests))
> PROVIDE(tests_end = .);
> }
> }
> INSERT AFTER .text;
> ----
>
>
> And one example in msvc
>
> ----
> #include <cstdio>
> #include <cstdint>
> #include <span>
> #include <iostream>
>
> using test_signature = void();
>
> // from comments from
> // https://learn.microsoft.com/en-us/archive/blogs/larryosterman/when-i-moved-my-code-into-a-library-what-happened-to-my-atl-com-objects
> #pragma comment(linker, "/merge:tests=.rdata")
> #pragma section("tests$a", read)
> #pragma section("tests$b", read)
> #pragma section("tests$c", read)
>
>
> #define CONCAT_IMPL(x, y) x##y
> #define CONCAT(x, y) CONCAT_IMPL(x, y)
>
> // NOTE: needs extern, or msvc optimizes it out; gcc had a similar issue
> // but it was fixed with an attribute
> // check OBJECT_ENTRY_PRAGMA
> #define REGISTER_FUN(name) \
> void name(); \
> extern __declspec(allocate("tests$b")) constexpr auto CONCAT(helper, \
> __LINE__) = &name; \
> void name()
>
> REGISTER_FUN(test1){std::puts("test1");}
> REGISTER_FUN(test2){std::puts("test2");}
>
> std::span<test_signature*> get_tests() noexcept {
> __declspec(allocate("tests$a")) static constinit test_signature*
> tests_begin = nullptr;
> __declspec(allocate("tests$c")) static constinit test_signature*
> tests_end = nullptr;
> const auto tests_size = ((uintptr_t)(&tests_end) -
> (uintptr_t)(&tests_begin))/sizeof(test_signature*);
> auto begin = &tests_begin;
> return std::span(begin, begin + tests_size);
> }
>
> int main(void)
> {
> auto funcs = get_tests();
> for(const auto& v : funcs){
> if(v){
> v();
> };
> }
> }
> ----
>
> https://godbolt.org/z/b5c7fE19j
>
>
> In both cases, the macro is only there for brevity.
>
> 1) Why would I like to have it standardized?
>
>
> a) Because the implementation is very error-prone: there is no
> array to iterate over, which requires inline assembly for gcc,
> and I am not sure what is needed for msvc.
>
> I believe that in some cases start_lifetime_as_array could be used,
> but it "only" works for cv void*, so function pointers and member
> function pointers are potentially left out.
>
> b) uintptr_t might not be available, but iterating over an array
> should always be possible
>
> c) the approach with GCC requires modifying the build environment, and
> might create invalid binary files (I was at least able to create
> binaries that run successfully, but that tools like nm and objdump were
> not able to analyze), opening a whole new can of worms
>
> d) the msvc implementation might have gaps between elements; these gaps
> have value 0, thus it is not possible to store some types
>
> e) (AFAIK) the only alternative available in c++ is to allocate
> memory, something like
>
> ----
> constinit std::vector<test_signature*> tests;
>
> #define REGISTER_FUN(name) \
> static void name(); \
> int CONCAT(impl, __LINE__) = (tests.push_back(&name),0); \
> static void name()
>
>
> REGISTER_FUN(test1){std::puts("test1");}
> REGISTER_FUN(test2){std::puts("test2");}
> ----
>
>
> but it requires allocating memory, which is problematic in some
> environments, and uses mutable memory (see EncodePointer/DecodePointer
> of msvc).
>
> Using an array with a maximum length still requires mutable memory, and
> it might not be possible to determine a sensible maximum length.
>
>
> 2) Why should it be standardized? / What are the use-cases?
>
> A similar technique has been used in the MFC framework and the Linux
> kernel (currently missing sources).
>
> The main use-cases would be plugin systems, where code can register hooks,
> and test suites (if you have ever used the catch test suite, rename
> REGISTER_FUN to TEST_CASE and we have something that can be used for
> registering test functions at compile time).
>
> 3) What features are expected by standardizing a way to create an
> array of elements from multiple translation units?
>
> a) no change to the build system
> b) works with all types
> c) array is initialized at compile-time, even if it cannot be used in
> constexpr context
> d) less error-prone than using linker scripts
> e) no assembly, casts, extern, or start_lifetime* required
> f) array is const-correct
> g) no memory allocation
>
>
> 4) Why did I not write a paper?
>
> a) first, I wanted to gather some feedback. Maybe something similar has
> already been proposed and rejected.
>
> b) It requires core wording changes I currently do not know how to
> express.
>
>
> Let me know what you think
>
> Best
>
> Federico
Received on 2024-10-17 18:20:36