Re: [std-proposals] Suggestion: non-static member variables for static-duration-only classes

From: Julien Villemure-Fréchette <julien.villemure_at_[hidden]>
Date: Fri, 03 Oct 2025 19:21:27 -0400
> Response: I'm working on an alternative to std::shared_mutex. This class is hard to implement without multiple threads doing RMW on the same atomic variable. RMW requires cache line locking between CPU cores, which is slow. The core stall for RMW increases more than linearly as the number of cores increases. I'm trying an approach where each thread has a dedicated seek/holding mutex flag, which only it writes. Thus, each mutex instance needs a dedicated thread_local variable to hold these flags.

I have never heard of a design technique that relies on synchronization through entities that are distinct in each thread. IMHO this does not seem possible, and it certainly doesn't scale as more threads get created, since creating thread-local objects has overhead. Also, reading and writing thread locals does not imply any memory ordering or synchronization (release/acquire or 'synchronizes with' relationships), which is mandatory for establishing synchronization between threads. And since each thread has its own copy of a thread-local variable, how would a write to that variable in one thread be observed by other threads reading the value?

Passing pointers to thread-local data between threads seems unmanageably complex: you would need to build up a container of such pointers, and that container would itself need to be guarded by a mutex. Plus, IIRC, that might not even work, because some platforms place thread-local storage on distinct physical pages for each thread. But again, since writes and reads through thread locals do not establish a 'synchronizes with' relationship, designing a synchronization primitive around thread locals seems DOA.

PS: cache line locking between CPU cores is essentially how the hardware implements inter-thread communication, and AFAIK there's no other way to do that.

Julien V.




-------- Original Message --------
From: Walt Karas via Std-Proposals <std-proposals_at_[hidden]>
Sent: October 3, 2025 3:00:05 p.m. EDT
To: "std-proposals_at_[hidden]" <std-proposals_at_[hidden]>, Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Cc: Walt Karas <wkaras_at_[hidden]>
Subject: Re: [std-proposals] Suggestion: non-static member variables for static-duration-only classes

On Friday, October 3, 2025 at 02:13:05 PM EDT, Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]> wrote:

On Thu, Oct 2, 2025 at 4:59 PM Walt Karas via Std-Proposals <std-proposals_at_[hidden]> wrote:
> I suggest allowing non-static thread_local member variables for classes, which would implicitly restrict instances of the class to have static storage duration.

At first glance, I thought this made some sense. Consider that we are currently allowed to write collections of related objects like this:
    int g_i;
    thread_local int g_j;
    void f() {
        static int s_i;
        static thread_local int s_j;
    }
And yet C++ doesn't currently allow us to package these related objects up into a struct:
    struct Package { int i; thread_local int j; };
    Package g;
    void f() {
        static Package s;
    }

However, after more thought and experimentation, I think the counter-argument is that we are also allowed to write:
    void g() {
        int p_m;
        static int p_n;
    }
and yet that's not equivalent to
    struct Package2 { int m; static int n; };
    void g() {
        Package2 p;
    }
(Specifically, the initialization around `n` differs.) So why should it work any better if we try to replace the `static` keyword with the `thread_local` keyword? Or `inline`, or `register`, or...

Thread_locals have weird initialization rules — even weirder than statics/globals. If you were allowed to write
    struct Package { int i; thread_local int j; };
then you might expect that the constructor of `Package p` should be responsible for constructing both `p.i` and `p.j`. But that can't be true, because the constructor of `p` is called only once, and `p.j` needs to be initialized as many times as you have threads. So that can't possibly work.

Consider also that thread_local variables already can have their initializers skipped over:
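    // https://godbolt.org/z/6cbj63YzP
    #include <cstdio>
    #include <string>
    #include <thread>

    int main() {
        thread_local std::string s = "hello world";
        std::thread t([]() {
            // when used here, s is uninitialized: this thread has its own s,
            // and control never passed through the declaration on this thread
            printf("%s\n", s.c_str());
        });
        t.join();
    }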

And they invariably have their destructors delayed to the end of the thread, instead of being destroyed in reverse order of construction (Godbolt). So they don't behave in an RAII fashion. It would be nice if they behaved better, but I'm not sure that's physically possible. Allowing them to "live inside" classes and RAII types might be a moral hazard, by implying that they work better than in fact they do.

Response: I think these points are generally addressed by restricting non-static thread_local member variables to objects with program lifetime, no?

If I had my druthers, C++ would never have standardized a `thread_local` keyword to begin with. I doubt it's possible to use that keyword safely or portably, even 14 years after it was invented.

Response: It would be more flexible if the Standard specified a pthread-like thread-local-storage interface in the Standard Library, and let you go with God and void pointer casting. But that would be more tedious and error-prone.

> This would have been nice to have for a class I'm currently working on. The only work-around I could come up with was [...]

FYI, you never explained what this class was or how thread_locals would have been useful to it. I bet there was a better way to do what you were trying to do; and I bet the `thread_local` keyword was the wrong tool, anyway.

Response: I'm working on an alternative to std::shared_mutex. This class is hard to implement without multiple threads doing RMW on the same atomic variable. RMW requires cache line locking between CPU cores, which is slow. The core stall for RMW increases more than linearly as the number of cores increases. I'm trying an approach where each thread has a dedicated seek/holding mutex flag, which only it writes. Thus, each mutex instance needs a dedicated thread_local variable to hold these flags. What I have so far is at https://github.com/wkaras/C-plus-plus-intrusive-container-templates/tree/ru_shared_mutex, files matching the pattern *ru_*. (Use your biggest air tank if you're taking the dive.)

–Arthur

Does Dr. Stroustrup hold the money when we make bets in this forum?


Received on 2025-10-03 23:21:43