Date: Wed, 12 Apr 2023 21:05:43 +0200
On 11/04/2023 00.30, Hans Boehm wrote:
> Agreed, but where is the restriction of plain memcpy() to trivially copyable types?
Here:
[basic.types.general] p3
The question is not whether you can apply memcpy (you always can do that),
but the issue is whether the result of the copy is an "alive" object,
or just a bunch of bytes that you scribbled into memory.
For non-trivially-copyable types, there is no guarantee the result of
the copy yields a usable object.
The "seqlock" example needs "alive" objects after the copy.
> I would suggest copying that, but I can't find it. The main definition is in [cstring,syn], which refers to the C standard.
See above; maybe the library wording should refer to "as if by memcpy" with
a cross-reference to [basic.types.general].
> This also made me worry about atomic_ref C++, but I think that's worded to make it clear that would introduce a data race, so we're probably OK there, too.
Yes, I think so.
> Maybe there should be an explicit note that the only concurrent conflicting accesses would be from the other function in this section?
I don't understand what you're trying to say here.
> I guess the way to fix this now is via a national body comment on the TS. But the time window for that is not open yet. And possibly a C++23 issue for plain memcpy?
memcpy is covered; see above.
Jens
> Hans
>
>
> On Thu, Apr 6, 2023 at 8:54 AM Jens Maurer <jens.maurer_at_[hidden] <mailto:jens.maurer_at_[hidden]>> wrote:
>
>
>
> On 06/04/2023 17.10, Andy via Std-Proposals wrote:
> > P1478 suggests that the added atomic_{load,store}_per_byte_memcpy can be implemented without accessing each byte individually
> >
> >> Note that on standard hardware, it should be OK to actually perform the copy at larger than byte granularity. Copying multiple bytes as part of one operation is indistinguishable from running them so quickly that the intermediate state is not observed. In fact, we expect that existing assembly memcpy implementations will suffice when suffixed with the required fence.
> >
> > A Rust RFC was made to mirror the same primitives in Rust, which also suggested that the implementation is not restricted to byte-sized accesses. The question of mixed size access was raised: https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781 <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781> <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781 <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781>>, which appears to apply to P1478 as well but was not discussed in the proposal.
> >
> > Since the actual access granularity of atomic memcpy is implementation-defined and opaque to the user, the user cannot guarantee that the same granularity is used when they access the same address using something other than atomic_{load,store}_per_byte_memcpy, or with a different count argument. It may be possible to guarantee a consistent granularity for each type, but since the proposed API takes void*, this isn’t possible.
> >
> > For instance, it's valid to have
> >
> > #include <atomic>
> >
> >
> > struct Foo {
> > std::atomic_int32_t a;
> > };
> >
> > void thread1(Foo& foo) {
> > foo.a.load(std::memory_order_acquire);
> > }
> >
> > void thread2(Foo& foo) {
> > Foo foo_copy;
> > std::atomic_load_per_byte_memcpy(&foo_copy, &foo, sizeof(Foo), std::memory_order_acquire);
> > }
> >
> > Resulting in the same address being accessed by two threads with potentially differently sized atomic operations, without order.
> >
> > The proposal seems to suggest that this is fine,
>
> No. "Foo" is not trivially copyable (because a std::atomic_int32_t is not trivially copyable),
> thus you can't apply memcpy to it. Note the comment in the example in the paper:
>
> Foo data; // Trivially copyable.
>
> This requirement seems to be missing from the wording, though.
>
> Jens
>
> Agreed, but where is the restriction of plain memcpy() to trivially copyable types?
Here:
[basic.types.general] p3
The question is not whether you can apply memcpy (you always can do that),
but the issue is whether the result of the copy is an "alive" object,
or just a bunch of bytes that you scribbled into memory.
For non-trivially-copyable types, there is no guarantee the result of
the copy yields a usable object.
The "seqlock" example needs "alive" objects after the copy.
> I would suggest copying that, but I can't find it. The main definition is in [cstring,syn], which refers to the C standard.
See above; maybe the library wording should refer to "as if by memcpy" with
a cross-reference to [basic.types.general].
> This also made me worry about atomic_ref C++, but I think that's worded to make it clear that would introduce a data race, so we're probably OK there, too.
Yes, I think so.
> Maybe there should be an explicit note that the only concurrent conflicting accesses would be from the other function in this section?
I don't understand what you're trying to say here.
> I guess the way to fix this now is via a national body comment on the TS. But the time window for that is not open yet. And possibly a C++23 issue for plain memcpy?
memcpy is covered; see above.
Jens
> Hans
>
>
> On Thu, Apr 6, 2023 at 8:54 AM Jens Maurer <jens.maurer_at_[hidden] <mailto:jens.maurer_at_[hidden]>> wrote:
>
>
>
> On 06/04/2023 17.10, Andy via Std-Proposals wrote:
> > P1478 suggests that the added atomic_{load,store}_per_byte_memcpy can be implemented without accessing each byte individually
> >
> >> Note that on standard hardware, it should be OK to actually perform the copy at larger than byte granularity. Copying multiple bytes as part of one operation is indistinguishable from running them so quickly that the intermediate state is not observed. In fact, we expect that existing assembly memcpy implementations will suffice when suffixed with the required fence.
> >
> > A Rust RFC was made to mirror the same primitives in Rust, which also suggested that the implementation is not restricted to byte-sized accesses. The question of mixed size access was raised: https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781 <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781> <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781 <https://github.com/rust-lang/rfcs/pull/3301#pullrequestreview-1082913781>>, which appears to apply to P1478 as well but was not discussed in the proposal.
> >
> > Since the actual access granularity of atomic memcpy is implementation-defined and opaque to the user, the user cannot guarantee that the same granularity is used when they access the same address using something other than atomic_{load,store}_per_byte_memcpy, or with a different count argument. It may be possible to guarantee a consistent granularity for each type, but since the proposed API takes void*, this isn’t possible.
> >
> > For instance, it's valid to have
> >
> > #include <atomic>
> >
> >
> > struct Foo {
> > std::atomic_int32_t a;
> > };
> >
> > void thread1(Foo& foo) {
> > foo.a.load(std::memory_order_acquire);
> > }
> >
> > void thread2(Foo& foo) {
> > Foo foo_copy;
> > std::atomic_load_per_byte_memcpy(&foo_copy, &foo, sizeof(Foo), std::memory_order_acquire);
> > }
> >
> > Resulting in the same address being accessed by two threads with potentially differently sized atomic operations, without order.
> >
> > The proposal seems to suggest that this is fine,
>
> No. "Foo" is not trivially copyable (because a std::atomic_int32_t is not trivially copyable),
> thus you can't apply memcpy to it. Note the comment in the example in the paper:
>
> Foo data; // Trivially copyable.
>
> This requirement seems to be missing from the wording, though.
>
> Jens
>
Received on 2023-04-12 19:05:48