Date: Wed, 10 Feb 2021 14:11:26 +0000
> I don't quite understand the explanation though. Is the reason that there is
> no way to get access to a global lock shard for the whole system so locks
> need to be embedded so that it works for shared memory across processes?
In the Windows ecosystem, a single process can have multiple versions of the C and C++ runtimes loaded. Many of these runtimes are not "owned" by the operating system and are upgraded independently of each other. A straightforward implementation of a "global" lock shard would therefore end up as a per-runtime lock shard. That would be fine so long as an atomic variable stays in code that uses a single runtime. As soon as the atomic is used in a part of the process that uses a different runtime, though, two different locks would end up guarding the same piece of memory. Carrying the lock along with the type avoids this problem.
It's similar to the libcu++ case, except instead of synchronizing between GPU and CPU, you are synchronizing between runtime 1 and runtime 2.
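To make the difference concrete, here is a minimal sketch, not how any real runtime is implemented. The names `Runtime`, `lock_for`, `Big`, and `EmbeddedLockAtomic` are hypothetical and exist only for illustration. Two `Runtime` objects stand in for two copies of the C++ runtime loaded in one process, each with its own private lock table.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for one loaded copy of a C++ runtime.
// Each copy owns a private table of lock shards.
struct Runtime {
    std::array<std::mutex, 16> shards;

    // Pick a shard by hashing the address of the object being guarded.
    std::mutex& lock_for(const void* addr) {
        auto bits = reinterpret_cast<std::uintptr_t>(addr);
        return shards[(bits >> 4) % shards.size()];
    }
};

// A value too large to update with a single hardware instruction
// (an assumption made for the sake of the example).
struct Big { long a, b, c; };

// Alternative design: the lock travels with the object itself.
struct EmbeddedLockAtomic {
    std::mutex lock;
    Big value{};

    Big load() {
        std::lock_guard<std::mutex> guard(lock);
        return value;
    }
    void store(Big v) {
        std::lock_guard<std::mutex> guard(lock);
        value = v;
    }
};

// With per-runtime tables, the same address maps to a different mutex
// in each runtime, so the two runtimes do not exclude each other.
bool table_lock_is_shared() {
    Runtime runtime1, runtime2;
    Big shared{};
    return &runtime1.lock_for(&shared) == &runtime2.lock_for(&shared);
}

// With an embedded lock, any code that reaches the object reaches the
// same mutex, regardless of which runtime that code was built against.
bool embedded_lock_is_shared() {
    EmbeddedLockAtomic shared;
    std::mutex* seen_by_runtime1 = &shared.lock;
    std::mutex* seen_by_runtime2 = &shared.lock;
    return seen_by_runtime1 == seen_by_runtime2;
}

// Usage sketch: call from main() to check both behaviours.
void demo() {
    assert(!table_lock_is_shared());
    assert(embedded_lock_is_shared());
}
```

The trade-off in this sketch is size: every `EmbeddedLockAtomic` pays for its own mutex, whereas a shared table amortizes the locks across all objects.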
Received on 2021-02-10 08:11:35