Date: Thu, 1 Jun 2023 10:42:21 -0400
On Thu, Jun 1, 2023 at 10:27 AM Phil Bouchard via Std-Proposals
<std-proposals_at_[hidden]> wrote:
>
>
>
> On 6/1/23 01:34, Thiago Macieira wrote:
> > On Wednesday, 31 May 2023 21:42:45 PDT Phil Bouchard wrote:
> >>> if (!container.empty())
> >>> container.push_back(1);
> >>>
> >>> is not safe.
> >>
> >> Why? The mutex is recursive so it just gives priority to the current
> >> thread for the scope of the condition.
> >
> > Because the container may have become non-empty between the check and the
> > push_back(1), in which case the code violated the requirement to append 1 only
> > if it was empty.
> >
> >>> And this of course becomes more complex when you have two or more
> >>> containers. Here's the first part of an algorithm: given containers C1
> >>> and C2, if the first element in C1 isn't in C2, then remove it from there
> >>> (pop_front) and append (push_back) to C2.
> >>>
> >>> If you realise what the second part will be, feel free to comment on it
> >>> and
> >>> explain how your API will ensure the user writes the proper code for that.
> >>
> >> // imagine both containers are locked here for the scope of the condition:
> >> if (find(C2.begin(), C2.end(), * C1.begin()) != C2.end())
> >> {
> >> T value = C1.front();
> >> C1.pop_front();
> >> C2.push_back(value);
> >> }
> >
> > Locked how? That's the interesting part.
> >
> > Are you suggesting that the compiler analyse the if-find line, find all lockable
> > elements in use there, then lock them behind the scenes without user action
> > and keep them locked until the end of the scope?
>
> Yes that's what I am trying to say since the beginning.

That is not possible in general. Some of that stuff can be hidden
behind function calls that are opaque to the compiler. It does not
know what gets called in that code, and since the object is shared, it
may not even be obvious whether the opaque function call can access
it.

Therefore, the compiler has two choices: don't automagically engage
the lock (and therefore potentially break code), or do automagically
engage the lock. But the latter can be pretty bad too; if the opaque
function doesn't use it, you now lock the container for an
indeterminate period of time. Especially since the opaque function can
have arbitrary amounts of code in it.

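
To make the opaque-call problem concrete, here is a minimal sketch. Every name in it (`shared_mutex`, `shared_container`, `Opaque`, `append_if`) is hypothetical and not taken from anyone's proposal; `std::function` merely stands in for a function defined in another translation unit. At the call site only the signature is visible, so whether to hold the lock across the call is a decision the author has to write down:

```cpp
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical shared state, for illustration only.
std::recursive_mutex shared_mutex;
std::vector<int> shared_container;

// Stand-in for a function whose body the compiler cannot see at the call
// site: it may or may not touch shared_container.
using Opaque = std::function<bool()>;

// The author chooses explicitly to hold the lock across the opaque call,
// so the predicate's answer is still valid when push_back runs.
// If the predicate is slow and never touches the container, every other
// thread is blocked for nothing; that cost is visible here, not hidden.
bool append_if(const Opaque& predicate, int value) {
    std::lock_guard<std::recursive_mutex> guard(shared_mutex);
    if (!predicate())
        return false;
    shared_container.push_back(value);
    return true;
}

// One possible callee: it does read the shared container (re-locking the
// recursive mutex), but nothing in the Opaque signature says so.
bool container_is_empty() {
    std::lock_guard<std::recursive_mutex> guard(shared_mutex);
    return shared_container.empty();
}
```

Calling `append_if(container_is_empty, 1)` twice appends once; passing a predicate that ignores the container compiles just the same, which is exactly why the lock scope cannot be inferred from the call alone.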
> Here is just an
> example of the net code the compiler would see after making temporary
> variables last for the duration of the conditional scope:
> https://github.com/philippeb8/std__ts/blob/master/ts.cpp#L23
>
> >> Again, where's the problem if both containers feature a recursive mutex?
> >
> > Recursion is not the problem. Atomicity of the action is. The point I tried to
> > make with the previous example is that one must reason about the duration of
> > the lock to decide when it must start and when it must end, so the conditions
> > don't change between statements. This is also the analogy of the transaction
> > that someone else posted in this thread.
> >
> > With the simple examples we're discussing here, it might be obvious what to do
> > and thus make solutions obvious. What others and I are telling you is that
> > when it gets to really complex thread-safe code, you *have* to reason about
> > when locks must start and when they must end, and what other locks you have.
> > Plus, reason about the order of locks, to avoid deadlocks.
> >
> > Thread-safety requires having the smallest possible critical sections, but no
> > smaller. If you pulverise your locks, you add overhead and actually lose
> > safety.
> >
> >> Regarding teaching, this is higher-level programming so a new namespace
> >> should encompass these new classes.
> >
> > That's not what I meant. You're oversimplifying the answer to a complex
> > question. Refer back to the top of this email:
> >
> > if (!container.empty())
> > container.push_back(1);
> >
> > This may have no data race and thus cause no data corruption, but it's not
> > what was required because the states may have changed. If this is still
> > allowed, then just using the container in question does *not* confer thread-
> > safety. And therefore, if it is allowed to compile, how do you propose we
> > teach everyone *how* to write code to actually make it thread-safe?
>
> You add some type trait allowing the compiler to determine whether the
> class is a "thread-safe" class or not.

This is what I like to call "bulldozer design". You declare that you
want X, and when people point out problems with achieving X, you take
a bulldozer to whatever those problems are. Every time someone points
out a problem, you say "oh, we'll add a niche solution to fix that."

You want a thread-safe container. Such containers have substantial
flaws that make using them in a thread-safe way difficult and
perilous. Therefore, every time someone shows such a circumstance, you
try to come up with some small language change to allow compilers to
fix it.

This kind of design often creates a bunch of tiny-yet-complex language
features that individually don't have any real meaning.

It's also a bad way to suggest a feature for standardization. If you
want to bulldozer your way to an idea, you need to figure out what the
problems are *yourself* and present a fully bulldozed path from here
to there. It's not our job to point out all of the houses and
buildings in the path of your bulldozer.

> > And how is that different from what we're already doing now?
>
> BTW forget my previous Github example, it is not generic enough. A
> thread-safe smart pointer (root_ptr or atomic<shared_ptr>) or the
> following wrapper with some type trait would be the way to go:
> https://fekir.info/post/sharing-data-between-threads/#_bind-the-data-and-mutex-together
>
> So for each temporary variable being a "thread_safe_type" object, the
> compiler would generate temporary variables lasting for the scope of the
> condition.
>
> But again we would need also new "thread-recursive" on top of
> "scope-recursive" mutices.

And how do you fix this:

```
container.push_back(1);
container.push_back(2);
```

Did the user expect these two operations to be atomic or not? That is,
does the user expect the outside world to observe these changes
individually or as a collective? You don't know and you *cannot* know.
These are distinct things and must be explicitly written distinctly.
Which means that users still have to pay attention when they share
containers.

The idea that you can write asynchronous code exactly like synchronous
code and the compiler will come along after you to fix all of your
problems is folly. It ain't happening. Not unless you do it in a
language and context that is expressly designed for that.
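
For the `push_back` pair above, this is roughly what "explicitly written distinctly" could look like. It is only a sketch under assumptions: `Synchronized`, `with_lock`, and the three helper functions are hypothetical names invented for illustration, loosely in the spirit of the bind-the-data-and-mutex wrapper linked earlier in the thread, not an existing or proposed API. The point is that the extent of each critical section is spelled out by the user:

```cpp
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical wrapper that binds a value to its mutex.
template <typename T>
class Synchronized {
public:
    // Runs f with exclusive access to the value. One call to with_lock
    // is exactly one critical section.
    template <typename F>
    decltype(auto) with_lock(F&& f) {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::forward<F>(f)(value_);
    }

private:
    std::mutex mutex_;
    T value_{};
};

using SharedInts = Synchronized<std::vector<int>>;

// Two critical sections: another thread may observe 1 without 2.
inline void push_individually(SharedInts& c) {
    c.with_lock([](std::vector<int>& v) { v.push_back(1); });
    c.with_lock([](std::vector<int>& v) { v.push_back(2); });
}

// One critical section: other threads observe neither value or both.
inline void push_together(SharedInts& c) {
    c.with_lock([](std::vector<int>& v) {
        v.push_back(1);
        v.push_back(2);
    });
}

// Check and append in the same critical section, so the emptiness test
// cannot go stale before the push_back runs.
inline bool push_if_empty(SharedInts& c, int value) {
    return c.with_lock([value](std::vector<int>& v) {
        if (!v.empty())
            return false;
        v.push_back(value);
        return true;
    });
}
```

Both `push_individually` and `push_together` are free of data races; they differ only in what other threads are allowed to observe, and that difference is something the user has to state.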
Received on 2023-06-01 14:42:33