std-discussion

Re: Global array of objects over multiple files

From: Lénárd Szolnoki <cpp_at_[hidden]>
Date: Thu, 17 Oct 2024 18:35:38 +0100
What happens when dynamic loading gets involved?

My impression is that, as presented, it is only implementable for static linking; is that right? Note that on *nix, shared objects are generally treated like any other TUs of the same program. Could this property be kept with the addition of a language feature like this, or would such arrays necessarily have hidden visibility implicitly?


On 17 October 2024 15:58:04 BST, Federico Kircheis via Std-Discussion <std-discussion_at_[hidden]> wrote:
>Hello,
>
>I would like to have, in C++, a way to "register" objects from different translation units in a global container (an array).
>
>This is already possible in gcc and msvc with tools outside of the language, with multiple drawbacks.
>
>To make it clearer what I am talking about, here is one example with gcc
>
>----
>#include <cstdio>
>#include <cstdint>
>#include <span>
>
>using test_signature = void();
>
>#define CONCAT_IMPL(x, y) x##y
>#define CONCAT(x, y) CONCAT_IMPL(x, y)
>#define REGISTER_FUN(name) \
> void name();\
> [[gnu::used]] constexpr auto CONCAT(helper, __LINE__) [[gnu::section(".tmptests")]] = &name; \
> void name()
>
>REGISTER_FUN(test1){std::puts("test1");}
>REGISTER_FUN(test2){std::puts("test2");}
>
>std::span<test_signature*> get_tests() noexcept {
> extern test_signature* tests_begin[];
> extern test_signature* tests_end[];
> const auto tests_size = ((uintptr_t)(tests_end) - (uintptr_t)(tests_begin))/sizeof(test_signature*);
> test_signature** begin = tests_begin;
> asm("":"+r"(begin));
> return std::span<test_signature*>(begin, begin + tests_size);
>}
>
>int main(){
> auto funcs = get_tests();
> for(const auto& v : funcs){
> v();
> }
>}
>----
>
>compile and execute with
>
>gcc --std=c++20 -Wl,-Tlinkerscript.ld main.cpp
>
>where linkerscript.ld is
>
>----
>SECTIONS
>{
> tests (READONLY) : {
> PROVIDE(tests_begin = .);
> KEEP(*(.tmptests))
> PROVIDE(tests_end = .);
> }
>}
>INSERT AFTER .text;
>----
>
>
>And one example in msvc
>
>----
>#include <cstdio>
>#include <span>
>#include <cstdint> // uintptr_t
>
>using test_signature = void();
>
>#pragma comment(linker, "/merge:tests=.rdata") // from comments from https://learn.microsoft.com/en-us/archive/blogs/larryosterman/when-i-moved-my-code-into-a-library-what-happened-to-my-atl-com-objects
>#pragma section("tests$a", read)
>#pragma section("tests$b", read)
>#pragma section("tests$c", read)
>
>
>#define CONCAT_IMPL(x, y) x##y
>#define CONCAT(x, y) CONCAT_IMPL(x, y)
>
>// NOTE: needs extern, or msvc optimizes out, gcc had similar issue but fixed with attribute
>// check OBJECT_ENTRY_PRAGMA
>#define REGISTER_FUN(name) \
> void name(); \
> extern __declspec(allocate("tests$b")) constexpr auto CONCAT(helper, __LINE__) = &name; \
> void name()
>
>REGISTER_FUN(test1){std::puts("test1");}
>REGISTER_FUN(test2){std::puts("test2");}
>
>std::span<test_signature*> get_tests() noexcept {
> __declspec(allocate("tests$a")) static constinit test_signature* tests_begin = nullptr;
> __declspec(allocate("tests$c")) static constinit test_signature* tests_end = nullptr;
> const auto tests_size = ((uintptr_t)(&tests_end) - (uintptr_t)(&tests_begin))/sizeof(test_signature*);
> auto begin = &tests_begin;
> return std::span(begin, begin + tests_size);
>}
>
>int main(void)
>{
> auto funcs = get_tests();
> for(const auto& v : funcs){
> if(v){
> v();
> }
> }
>}
>----
>
>https://godbolt.org/z/b5c7fE19j
>
>
>In both cases, the macro is only there for brevity.
>
>1) Why would I like to have it standardized?
>
>
>a) Because the implementation is very error-prone, as there is no array to iterate over, requiring the usage of inline assembly for gcc, and I am not sure what for msvc.
>
>I believe that for some cases start_lifetime_as_array could be used, but it "only" works for cv void*, thus function pointers and member function pointers are potentially left out.
>
>b) uintptr_t might not be available, but iterating over an array should always be possible
>
>c) the approach with GCC requires modifying the build environment and might create invalid binary files (I was at least able to create binaries that ran successfully but that tools like nm and objdump were not able to analyze), opening a whole new can of worms
>
>d) the msvc implementation might have gaps between elements; these gaps have the value 0, thus it is not possible to store some types
>
>e) (AFAIK) the only alternative available in C++ is to allocate memory, something like
>
>----
>constinit std::vector<test_signature*> tests;
>
>#define REGISTER_FUN(name) \
> static void name(); \
> int CONCAT(impl, __LINE__) = (tests.push_back(&name),0); \
> static void name()
>
>
>REGISTER_FUN(test1){std::puts("test1");}
>REGISTER_FUN(test2){std::puts("test2");}
>----
>
>
>but it requires allocating memory, which is problematic in some environments, and it uses mutable memory (see EncodePointer/DecodePointer of msvc).
>
>Using an array with a maximum length still requires mutable memory, and it might not be possible to determine a sensible maximum length.
>
>
>2) Why should it be standardized? / What are the use-cases?
>
>A similar technique has been used in the MFC framework and the Linux kernel (currently missing sources).
>
>Main use-cases would be plugin systems, where code can register hooks, and test suites (if you have ever used the Catch test suite, rename REGISTER_FUN to TEST_CASE and we have something that can be used for registering test functions at compile time).
>
>3) What features are expected by standardizing a way to create an array of elements from multiple translation units?
>
>a) no change to the build system
>b) works with all types
>c) array is initialized at compile-time, even if it cannot be used in constexpr context
>d) less error-prone than using linker scripts
>e) no assembly, casts, extern, or start_lifetime* required
>f) array is const-correct
>g) no memory allocation
>
>
>4) Why did I not write a paper?
>
>a) first, I wanted to gather some feedback. Maybe something similar has already been proposed and rejected.
>
>b) It requires core wording changes I currently do not know how to express.
>
>
>Let me know what you think
>
>Best
>
>Federico

Received on 2024-10-17 17:35:42