Date: Thu, 17 Oct 2024 22:42:29 +0200
On 17/10/2024 20.20, Elias Kosunen wrote:
> P2889 is prior art in this space.
>
> - Elias
Thank you.
The paper is relatively new (from 2023); was it ever discussed?
> On 10/17/24 5:58 PM, Federico Kircheis via Std-Discussion wrote:
>> Hello,
>>
>> I would like to have, in c++, a way to "register" objects from
>> different translation units in a global container (an array).
>>
>> This is already possible in gcc and msvc with tools outside of the
>> language, with multiple drawbacks.
>>
>> To make it clearer what I am talking about, here is one example
>> with gcc
>>
>> ----
>> #include <cstdio>
>> #include <cstdint>
>> #include <span>
>>
>> using test_signature = void();
>>
>> #define CONCAT_IMPL(x, y) x##y
>> #define CONCAT(x, y) CONCAT_IMPL(x, y)
>> #define REGISTER_FUN(name) \
>> void name();\
>> [[gnu::used]] constexpr auto CONCAT(helper, __LINE__) [[gnu::section(".tmptests")]] = &name; \
>> void name()
>>
>> REGISTER_FUN(test1){std::puts("test1");}
>> REGISTER_FUN(test2){std::puts("test2");}
>>
>> std::span<test_signature*> get_tests() noexcept {
>> extern test_signature* tests_begin[];
>> extern test_signature* tests_end[];
>> const auto tests_size = ((uintptr_t)(tests_end) - (uintptr_t)(tests_begin))/sizeof(test_signature*);
>> test_signature** begin = tests_begin;
>> asm("":"+r"(begin));
>> return std::span<test_signature*>(begin, begin + tests_size);
>> }
>>
>> int main(){
>> auto funcs = get_tests();
>> for(const auto& v : funcs){
>> v();
>> }
>> }
>> ----
>>
>> compile and execute with
>>
>> gcc --std=c++20 -Wl,-Tlinkerscript.ld main.cpp
>>
>> where linkerscript.ld is
>>
>> ----
>> SECTIONS
>> {
>> tests (READONLY) : {
>> PROVIDE(tests_begin = .);
>> KEEP(*(.tmptests))
>> PROVIDE(tests_end = .);
>> }
>> }
>> INSERT AFTER .text;
>> ----
>>
>>
>> And one example in msvc
>>
>> ----
>> #include <cstdio>
>> #include <cstdint>
>> #include <span>
>> #include <iostream>
>>
>> using test_signature = void();
>>
>> // from comments from https://learn.microsoft.com/en-us/archive/blogs/larryosterman/when-i-moved-my-code-into-a-library-what-happened-to-my-atl-com-objects
>> #pragma comment(linker, "/merge:tests=.rdata")
>> #pragma section("tests$a", read)
>> #pragma section("tests$b", read)
>> #pragma section("tests$c", read)
>>
>>
>> #define CONCAT_IMPL(x, y) x##y
>> #define CONCAT(x, y) CONCAT_IMPL(x, y)
>>
>> // NOTE: needs extern, or msvc optimizes it out; gcc had a similar issue
>> // but it was fixed with an attribute
>> // check OBJECT_ENTRY_PRAGMA
>> #define REGISTER_FUN(name) \
>> void name(); \
>> extern __declspec(allocate("tests$b")) constexpr auto CONCAT(helper, __LINE__) = &name; \
>> void name()
>>
>> REGISTER_FUN(test1){std::puts("test1");}
>> REGISTER_FUN(test2){std::puts("test2");}
>>
>> std::span<test_signature*> get_tests() noexcept {
>> __declspec(allocate("tests$a")) static constinit test_signature* tests_begin = nullptr;
>> __declspec(allocate("tests$c")) static constinit test_signature* tests_end = nullptr;
>> const auto tests_size = ((uintptr_t)(&tests_end) - (uintptr_t)(&tests_begin))/sizeof(test_signature*);
>> auto begin = &tests_begin;
>> return std::span(begin, begin + tests_size);
>> }
>>
>> int main(void)
>> {
>> auto funcs = get_tests();
>> for(const auto& v : funcs){
>> if(v){
>> v();
>> }
>> }
>> }
>> ----
>>
>> https://godbolt.org/z/b5c7fE19j
>>
>>
>> In both cases, the macro is only there for brevity.
>>
>> 1) Why would I like to have it standardized?
>>
>>
>> a) Because the implementation is very error-prone, as there is no
>> array to iterate over, requiring the usage of inline assembly for gcc,
>> and I am not sure what for msvc.
>>
>> I believe that for some cases start_lifetime_as_array could be used,
>> but it "only" works for cv void*, thus function pointers and member
>> function pointers are potentially left out.
>>
>> b) uintptr_t might not be available, but iterating over an array
>> should always be possible
>>
>> c) the approach with GCC requires modifying the build environment, and
>> might create invalid binary files (I was at least able to create
>> binaries that run successfully, but that tools like nm and objdump were
>> not able to analyze), opening a whole new can of worms
>>
>> d) the msvc implementation might have gaps between elements; these gaps
>> have the value 0, thus it is not possible to store some types
>>
>> e) (AFAIK) the only alternative available in c++ is to allocate
>> memory, something like
>>
>> ----
>> constinit std::vector<test_signature*> tests;
>>
>> #define REGISTER_FUN(name) \
>> static void name(); \
>> int CONCAT(impl, __LINE__) = (tests.push_back(&name),0); \
>> static void name()
>>
>>
>> REGISTER_FUN(test1){std::puts("test1");}
>> REGISTER_FUN(test2){std::puts("test2");}
>> ----
>>
>>
>> but it requires allocating memory, which is problematic in some
>> environments, and uses mutable memory (see EncodePointer/DecodePointer
>> of msvc).
>>
>> Using an array with a maximum length still requires mutable memory, and
>> it might not be possible to determine a sensible maximum length.
>>
>>
>> 2) Why should it be standardized? / What are the use-cases?
>>
>> A similar technique has been used in the MFC framework and the Linux
>> kernel (currently missing sources).
>>
>> Main use-cases would be plugin systems, where code can register hooks,
>> and test suites (if you have ever used the catch test suite, rename
>> REGISTER_FUN to TEST_CASE and we have something that can be used for
>> registering test functions at compile time).
>>
>> 3) What features are expected by standardizing a way to create an
>> array of elements from multiple translation units
>>
>> a) no change to the build system
>> b) works with all types
>> c) array is initialized at compile-time, even if it cannot be used in
>> constexpr context
>> d) less error-prone than using linker scripts
>> e) no assembly, casts, extern, or start_lifetime* required
>> f) array is const-correct
>> g) no memory allocation
>>
>>
>> 4) Why did I not write a paper?
>>
>> a) first, I wanted to gather some feedback. Maybe something similar has
>> already been proposed and rejected.
>>
>> b) It requires core wording changes I currently do not know how to
>> express.
>>
>>
>> Let me know what you think
>>
>> Best
>>
>> Federico
Received on 2024-10-17 20:42:40