std-proposals

Re: RFC: disjoint qualifier

From: Eric Lengyel <lengyel_at_[hidden]>
Date: Fri, 18 Sep 2020 01:02:11 -0700
I think you misunderstand the intent of the disjoint proposal. It's not meant to involve any kind of proof of disjoint-ness, and it certainly is not meant to resemble Rust's borrow checker. It is simply a tool for the programmer to use at their own discretion (or peril -- this is C++ after all).

The disjoint qualifier allows intelligent use of the type system, but it is no safer than the existing const qualifier. In fact, the two have very symmetric effects on a program. Consider this function:

void foo(const X& x);

This is a promise made by the function foo() to the caller that the contents of x won't be changed inside the function foo(), but it's not absolute proof. Nothing is stopping somebody from making bad design choices and putting code inside foo() that changes the contents of x anyway. How far can the compiler go when it takes the const-ness of x into account? Can it move a load from x that happens before foo() is called to a time after foo() is called under the assumption that foo() didn't change it?
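
To make that concrete, here is a minimal sketch of how a const reference parameter fails to guarantee that the referenced object stays unchanged. The global g_x and the helper load_before_and_after() are hypothetical names introduced only for this illustration; the behavior is well-defined because g_x itself is not declared const.

```cpp
#include <cassert>

struct X { int value; };

X g_x{1};  // hypothetical global, used only for this illustration

// foo() promises not to modify the object through `x` ...
void foo(const X& x) {
    // ... but it can still modify the same object through another,
    // non-const path. This is legal because g_x was not declared const.
    g_x.value = 2;
    (void)x;  // the parameter itself is never written through
}

int load_before_and_after() {
    g_x.value = 1;
    const X& ref = g_x;
    int before = ref.value;  // load before the call
    assert(before == 1);
    foo(ref);
    int after = ref.value;   // this load cannot be hoisted above foo()
    return after - before;
}
```

Because foo() may legally change the object through the other path, a compiler that cannot see the body of foo() has to reload ref.value after the call.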

Conversely, consider this function:

void bar(disjoint X& x);

This is a promise made by the caller to the function bar(), in the opposite sense of the const qualifier, that the contents of x are not aliased. But again, it's not absolute proof. We want the compiler to be able to safely make the assumption that x is not aliased inside bar(), but it requires that the programmer use the disjoint qualifier properly on the calling side. Of course, programmers who don't know how to use a tool could hurt themselves with it. If they don't use disjoint correctly, their programs are buggy, just as with a million other things in C++.
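
As a rough illustration of what the caller would be promising, the sketch below uses plain references, since disjoint is not valid C++ today. The function add_twice() and the two helpers are hypothetical names chosen for this example. It shows that the result depends on whether the two arguments alias, which is why the compiler must currently reload b after the first store.

```cpp
#include <cassert>

// Adds `b` to `a` twice. With no aliasing information, the compiler must
// reload `b` after the first store, since `a` and `b` may be the same object.
void add_twice(int& a, const int& b) {
    a += b;
    a += b;
}

int distinct_objects() {
    int a = 1;
    int b = 10;
    add_twice(a, b);  // 1 + 10 + 10
    return a;
}

int aliased_objects() {
    int a = 1;
    add_twice(a, a);  // 1 -> 2 -> 4, because the second read sees the first write
    return a;
}
```

If the parameters were qualified as disjoint, the caller would be promising that only the first kind of call occurs, and the compiler could read b once and reuse the value.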

> If you are able to declare a variable disjoint, but it has to be used in a context where it actually isn't, then it would be unsafe for the compiler to actually act on that keyword in any form.

Any context in which it's not safe for the compiler to act on the disjoint keyword would not actually include the disjoint keyword. Objects that were originally disjoint would implicitly lose their disjoint qualifiers in such contexts (e.g., passing a disjoint object to a function that takes a non-disjoint parameter), and everything would be fine.

> Think e.g. good old `memcpy` vs `memmove` problem. Except with restrict (or __restrict, which many compilers do support in C++ syntax) at least you had to aim before you shoot your own foot; with a propagating disjoint qualifier, that gun comes pre-aimed for you.

Disjoint is much safer than __restrict because you can't accidentally pass pointers to non-disjoint memory to functions expecting non-aliasing address ranges. The __restrict version would silently accept pointers to aliasing memory and fail.

> You don't see pointers often anymore

*You* may not see pointers often anymore, but to generalize your personal experience to the whole world is incorrect. There are many companies where the style/convention is pointers-only. But it doesn't matter because all the various wrappers that could encapsulate a pointer to X or a pointer to const X could just as easily encapsulate a pointer to disjoint X.

> As a_end can be reached by incrementing a_begin, each two parameters here are not just "not disjoined", but even required to form a single range or even be identical as the only way to encode an empty range.

This is not a problem because code would never actually access the object referenced by a_end. Disjoint-ness is violated only if it's possible to change the output of a program by reordering reads and writes to disjoint objects.
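
A small sketch of the iterator case, using raw pointers as iterators. The names copy_range() and copy_adjacent_halves() are hypothetical and exist only for this example. The past-the-end pointer of the source range is the same address as the start of the destination range, yet no element is both read and written, so reordering the reads and writes could not change the result.

```cpp
#include <cassert>

// Copies [src_begin, src_end) to dst_begin.
// src_end is only compared against, never dereferenced.
void copy_range(const int* src_begin, const int* src_end, int* dst_begin) {
    for (; src_begin != src_end; ++src_begin, ++dst_begin) {
        *dst_begin = *src_begin;
    }
}

int copy_adjacent_halves() {
    int buffer[4] = {1, 2, 0, 0};
    // src_end and dst_begin are both buffer + 2, but the elements that are
    // read (indices 0 and 1) and written (indices 2 and 3) do not overlap.
    copy_range(buffer, buffer + 2, buffer + 2);
    assert(buffer[0] == 1 && buffer[1] == 2);  // source half is untouched
    return buffer[2] * 10 + buffer[3];
}
```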

This proposal is not attempting to prohibit or prevent the existence of two pointers to the same disjoint object. It's giving the programmer a new tool that can be used to tell the compiler "I know what I'm doing, and I promise that these specific objects will not be accessed in a way for which possible aliasing must be assumed." If the programmer is wrong when making such a statement to the compiler, then it's their fault if things don't work properly.


-----Original Message-----
From: Std-Proposals <std-proposals-bounces_at_[hidden]> On Behalf Of Andreas Ringlstetter via Std-Proposals
Sent: Friday, September 18, 2020 12:00 AM
To: Std-Proposals <std-proposals_at_[hidden]>
Cc: Andreas Ringlstetter <andreas.ringlstetter_at_[hidden]>
Subject: Re: [std-proposals] RFC: disjoint qualifier

How is that supposed to work?

Take e.g.:
    struct foo {};
    struct bar {
        disjoint foo *foobar;
    };
    bar a;
    bar& b = a;
    function_requiring_disjoint_pointers(&a.foobar, &b.foobar);

Proving that the two parameters had been disjoint (or actually not, as in this case) is a non-trivial task. Furthermore, it's not actually possible to decide this solely based on a type qualifier, unless that qualifier had *forbidden* any operation which could have destroyed the restricted property. Meaning there must never have been more than a single reference to the storage in existence (think e.g. std::unique_ptr, but allowing move only, no reference to it), and there must not be any owner to which more than a single reference can exist.

Your motivation appears to resemble the borrow rules from the Rust language, but you have to be aware that this only works if unique references are enforced throughout the whole application, top down. You can't just take a type somewhere in a nested data structure and declare "you are now without aliases". E.g., as the above example showed, there mustn't have been an instance of `bar` without a disjoint qualifier (as it had qualified members), the second reference to `a` mustn't be permitted, etc.

Without enforcement, this actually turns dangerous instead. If you are able to declare a variable disjoint, but it has to be used in a context where it actually isn't, then it would be unsafe for the compiler to actually act on that keyword in any form. Think e.g. good old `memcpy` vs `memmove` problem. Except with restrict (or __restrict, which many compilers do support in C++ syntax) at least you had to aim before you shoot your own foot; with a propagating disjoint qualifier, that gun comes pre-aimed for you.

Then there is also another reason why `restrict` has such limited usage in the C++ language: You don't see pointers often anymore, and iterators (with the concept of a "past-the-end" iterator) are by design incompatible with any attempt to declare them as alias-free.

E.g. take:

  void foo_cpp(it a_begin, it a_end, it b_begin, it b_end);
  void foo_c(T * restrict a, size_t a_size, T * restrict b, size_t b_size);

As a_end can be reached by incrementing a_begin, each two parameters here are not just "not disjoined", but even required to form a single range or even be identical as the only way to encode an empty range.
Only in the C style interface, it's possible to express aliasing rules at all.

On Fri, 18 Sep 2020 at 03:38, Eric Lengyel via Std-Proposals <std-proposals_at_[hidden]> wrote:
>
> Standard C++ does not include the “restrict” qualifier from C, and many serious problems arise when attempting to properly incorporate it into the C++ language. The following proposal introduces a superior alternative in the form of a “disjoint” qualifier that functions differently in several important ways. The disjoint qualifier is a clean and natural fit within the established design of the C++ language, and it adds type-safe aliasing controls in a backward-compatible manner.
>
> http://terathon.com/disjoint_lengyel.pdf
>
> Please discuss.
>
> -- Eric Lengyel
>
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2020-09-18 03:02:17