Date: Fri, 4 Aug 2023 22:45:41 -0400
On Fri, Aug 4, 2023 at 7:02 PM Jan Schultke via Std-Discussion
<std-discussion_at_[hidden]> wrote:
>
> The issue starts with the following code gen: https://godbolt.org/z/PYTohPTKr
>
> Here, adding `__restrict` improves the code gen for Clang, but it doesn't for
> GCC. GCC seems to assume that no aliasing can take place between
> `this` and other parameters passed into the function. LLVM does not
> add `noalias` without `__restrict`.
>
> [class.cdtor] p2 says:
> > During the construction of an object, if the value of the
> > object or any of its subobjects is accessed through a
> > glvalue that is not obtained, directly or indirectly,
> > from the constructor's this pointer, the value
> > of the object or subobject thus obtained is unspecified.
>
> I believe that this paragraph's intent is to disallow aliasing, e.g.
> between `this` and rvalue reference passed into the move constructor.
> However, the wording is very difficult to implement.
> - GCC simply assumes `noalias`, which also assumes that writing to the
> aforementioned glvalue will make its value unspecified. The paragraph
> only talks about the obtained value though.
> - clang does not assume anything, and possibly misses optimizations as a result.
>
> Am I interpreting the intent correctly? If so, how can the wording be
> improved to fully disallow aliasing, as intended, so that we can just
> add `noalias` here, as is already done?

I was complaining about this clause a while back; "the value... thus
obtained is unspecified" is a remarkably weak criterion compared to
the full UB of 'noalias' or '__restrict'. You can find the thread at
<https://lists.isocpp.org/std-discussion/2022/12/1952.php>.
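
To make the gap concrete, here is a minimal sketch. The `Pair` type and the `check_non_aliasing_case` helper are hypothetical names made up for illustration; this is not the code behind the Godbolt link quoted above.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Pair {
    int first;
    int second;

    // `src` may or may not refer to storage inside *this.
    explicit Pair(const int& src) {
        first = src;   // store through an lvalue derived from `this`
        second = src;  // second read through `src`
    }
};

// The ordinary, non-aliasing use: well-defined under any reading of the clause.
inline void check_non_aliasing_case() {
    int n = 7;
    Pair p(n);
    assert(p.first == 7 && p.second == 7);
}
```

If `src` were allowed to alias `this->first` in the ordinary way, the second read would have to be reloaded after the store to `first`. As I read [class.cdtor] p2, a read of a subobject under construction through `src` only yields an unspecified value, so reusing the first load looks permitted. A `__restrict`-qualified `this` would instead make the conflicting access undefined, and the clause says nothing about writes through the other glvalue.
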

Originally, the clause only applied to const objects under
construction, but it was later changed to apply to all objects. I
wasn't able to find any definite rationale of why the original clause
was added or why it was expanded; one can speculate that it was to
allow compilers to blanket-substitute 'global_const_object.field' with
its final computed value, but that doesn't work when the value can
also change in the destructor. Unless it's meant to help classes with
nontrivial constructors but trivial destructors?

Regardless, I don't think applying '__restrict' to every implicit
'this' parameter would even be desirable here. I have an example at
<https://github.com/cplusplus/CWG/issues/206> of how it would break
classes holding self-referential pointers: suppose that derived
classes B and C share a virtual base class A, and A's constructor
stores a pointer to one of its fields in another field. When A's
constructor is called from B's constructor, it stores a pointer
derived from A's 'this' pointer, which is derived from B's 'this'
pointer. But if C's constructor then tries to access the field (which
is an indirect subobject of C) via the pointer (derived from B's
'this'), then it breaks the rules of the clause. If the constructor
also modifies the pointee directly, with an lvalue derived from C's
'this', then it also breaks the rules of '__restrict'.
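
A rough reconstruction of that shape is below. It is a sketch written for this message, not the code from the linked CWG issue; the member names and the `check_layout` helper are made up, and B is assumed to be the most-derived class so that B's constructor is the one that runs A's constructor.

```cpp
#include <cassert>

// Hypothetical reconstruction; not the code from the CWG issue.
struct A {
    int value;
    int* self;  // self-referential: points at this->value
    A() : value(1), self(&value) {}
};

struct C : virtual A {
    bool self_points_at_value;
    int seen_via_self;
    C() {
        // A was already constructed by the most-derived constructor (B's),
        // so `self` holds a pointer that was not derived from C's `this`.
        self_points_at_value = (self == &value);  // compares addresses only
        seen_via_self = *self;  // the read that the clause appears to cover
        value = 2;              // write through an lvalue derived from C's `this`
    }
};

// B is the most-derived class, so B's constructor initializes the virtual base A.
struct B : virtual A, C {
    B() {}
};

inline void check_layout() {
    B b;
    assert(b.self == &b.value);      // the stored pointer still targets A::value
    assert(b.self_points_at_value);  // and already did inside C's constructor
}
```

The asserts only check addresses, which are not affected by the clause. The value stored in `seen_via_self` is deliberately left unchecked, since under the reading above it is the one the clause makes unspecified. With a '__restrict'-qualified 'this' in C's constructor, the read through `self` combined with the write to `value` would be the conflicting pair.
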

Personally, I'd be in favor of removing the clause entirely, if the
compiler writers are misinterpreting it as implying '__restrict'. It
can't be strengthened into a full '__restrict' without resulting in
the self-referential issue, at least not without a guarantee that the
annotation only applies to the outermost constructor and not to
subobject constructors.
Received on 2023-08-05 02:45:54