std-proposals: Re: RFC: disjoint qualifier

From: Eric Lengyel <lengyel_at_[hidden]>
Date: Wed, 23 Sep 2020 14:46:21 -0700

> If it is only about discipline, why does this need to be in C++?

It’s not about discipline at all. It’s about giving the compiler extra information that it can use to perform better optimizations. The addition of a qualifier to the type system does two things: (a) it allows proper overloading, and (b) it prevents the mistake of passing aliased storage to a function expecting disjoint storage, which is not prevented by restrict.
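To make point (b) concrete, here is a minimal sketch of the gap in restrict. It assumes the `__restrict` extension spelling accepted by GCC, Clang, and MSVC (standard C++ has no restrict), and the function names are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// __restrict is a compiler extension, not standard C++. It is a promise that
// dst and src do not overlap for the duration of the call.
inline void scale_add(float* __restrict dst, const float* __restrict src, int n) {
    for (int i = 0; i < n; ++i) {
        dst[i] += 2.0f * src[i];
    }
}

// The promise is invisible to the type system: any plain float* is accepted,
// so a call such as scale_add(buf, buf, n) type-checks even though it breaks
// the promise. (It is deliberately not executed here.)
static_assert(std::is_invocable_v<decltype(&scale_add), float*, const float*, int>);

// A call that keeps the promise: the two arrays are separate objects.
inline float scale_add_demo() {
    float a[2] = {1.0f, 2.0f};
    const float b[2] = {10.0f, 20.0f};
    scale_add(a, b, 2);
    assert(a[0] == 21.0f && a[1] == 42.0f);
    return a[0] + a[1];
}
```

The static_assert is the whole point: nothing in the signature distinguishes a caller that honors the no-overlap promise from one that does not.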

From: Std-Proposals <std-proposals-bounces_at_[hidden]> On Behalf Of Henry Miller via Std-Proposals
Sent: Wednesday, September 23, 2020 8:22 AM
To: Henry Miller via Std-Proposals <std-proposals_at_[hidden]>
Cc: Henry Miller <hank_at_[hidden]>
Subject: Re: [std-proposals] RFC: disjoint qualifier

On Tue, Sep 22, 2020, at 16:41, Eric Lengyel via Std-Proposals wrote:

> Don't assume. It can't help, but (if you assume wrong) it might hurt.

I was simply attempting to acknowledge the likelihood that you possess some familiarity with the subject material because many people seem to get offended if I explain it to them as if they didn’t already understand. So let me be a little more clear about my expectations from a competent professional. If you’d like to participate in a meaningful discussion about the disjoint proposal, then I *expect* you to have familiarized yourself with the prerequisite background material about the restrict qualifier, how it’s used by a compiler, and why people want such a mechanism to be available.

I know some of it, but not all. That is one reason to summarize it - to make sure people who THINK they understand enough to comment actually do. Also it ensures that if there is something wrong with your understanding other experts can point that out.

> That doesn't sound useful or semantically correct. I mean, aren't any two local variables always semantically disjoint?

Yes, that is true. Ideally, every variable would be implicitly disjoint unless decorated with some kind of “alias” qualifier (which would have implicit-add semantics like const and volatile) telling the compiler that the variable’s storage could possibly be accessed through multiple references. But we can’t implement that because it would break backward compatibility rather spectacularly. If anything at all is to be implemented, then it has to be the implicit-subtract “disjoint” qualifier because it does not change the meaning of any existing code. An object is formally less-qualified when it has the disjoint qualifier.
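For comparison, the implicit-add behavior of const and volatile can be checked with standard type traits. The sketch below only demonstrates the existing qualifiers; the mirrored implicit-subtract behavior of disjoint appears in comments because it is not valid C++ today, and the function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Implicit-add: const and volatile may be added silently but never removed silently.
static_assert(std::is_convertible_v<int*, const int*>);
static_assert(!std::is_convertible_v<const int*, int*>);
static_assert(std::is_convertible_v<int*, volatile int*>);
static_assert(!std::is_convertible_v<volatile int*, int*>);

// Hypothetical mirror image for the proposed qualifier (does not compile today):
//   disjoint int*  ->  int*            implicit (the qualifier is dropped)
//   int*           ->  disjoint int*   would require an explicit cast

inline int qualifier_demo() {
    int x = 7;
    const int* p = &x;  // adding const needs no cast
    assert(*p == 7);
    return *p;
}
```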

> That's not my definition of "type safety." To me, "type safety" means that the compiler should

> (A) prevent having two `disjoint` pointers to the same storage, or at least make it easier to write static-analyzer passes to detect that issue

That is “alias safety”, not “type safety”. Type safety means that you can’t accidentally pass a pointer to a non-disjoint object to a function expecting a disjoint object. The restrict qualifier provides no such safety (and cannot be made to), but the disjoint qualifier does. Static analyzers are certainly able to detect cases in which a program creates two pointers to the same disjoint storage (which by itself is not dangerous) and uses them in a suspicious manner (perhaps by passing both as different parameters to the same function).

I don't see how disjoint in the language does this.

For a simple context, yes, but as has already been shown with restrict, compilers find the simple cases and warn correctly. In any large project, however, there are likely to be many different teams and many different translation units. I'm not worried about the simple cases because, as was already shown, the compiler warns about them. I'm worried about cases where multiple teams who don't understand each other's code are passing things between them. For example, on one class, team A calls setDisjointValueOne, team B calls setDisjointValueTwo, and then team C calls a function in that class that requires the two class variables to be disjoint. It seems to me that the above is easy to do while following all your rules.
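A minimal sketch of that three-team scenario follows. The class name and the addTwice member are hypothetical, and plain pointers stand in for the proposed qualifier since disjoint does not exist in any compiler. No restrict is used, so both runs have well-defined behavior and simply produce different results.

```cpp
#include <cassert>

// Hypothetical class shared by three teams.
class Accumulator {
public:
    void setDisjointValueOne(int* p) { one_ = p; }  // called by team A
    void setDisjointValueTwo(int* p) { two_ = p; }  // called by team B

    // Called by team C, written on the assumption that one_ and two_
    // never refer to the same storage.
    void addTwice() {
        *one_ += *two_;
        *one_ += *two_;  // if the storage is shared, *two_ has changed by now
    }

private:
    int* one_ = nullptr;
    int* two_ = nullptr;
};

// Separate storage: x becomes 1 + 2 + 2.
inline int run_distinct() {
    int x = 1, y = 2;
    Accumulator acc;
    acc.setDisjointValueOne(&x);
    acc.setDisjointValueTwo(&y);
    acc.addTwice();
    return x;
}

// Same storage handed to both setters: x becomes 1 + 1, then 2 + 2.
inline int run_aliased() {
    int x = 1;
    Accumulator acc;
    acc.setDisjointValueOne(&x);
    acc.setDisjointValueTwo(&x);
    acc.addTwice();
    return x;
}
```

Each setter call looks correct in isolation; the overlap only exists once both calls are considered together, which is the cross-team concern raised here.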

If it is only about discipline, why does this need to be in C++? Why can't you use a form of Hungarian notation, with all disjoint variables prefixed with dj_ (or whatever), and then write a tool to verify that the rules are followed? What does putting this in the language solve that the above doesn't?

If you want me to support this in the language, then I want some assurance that following the rules (never casting them away) will result in something better than what we have. Right now it seems to me that nothing useful happens since the complex cases where this can happen can still be reached following all your rules, while the compiler can catch the simple cases without them.

I can see two solutions to this:

First, you could define a set of rules that the compiler can enforce, so that no matter how many different contexts I cross in my large codebase, there is an error when someone introduces a real alias bug that wouldn't be there without the rules (assuming no disjoint_cast is used, which I agree you need). If there is any situation where your rules won't catch a real alias problem, then document it so we can debate whether we are adding a false sense of security.

Alternatively, write a sanitizer that finds alias issues at runtime and show how adding disjoint allows the compiler to insert extra annotations that result in more useful debug messages on error. This annotation must be expensive enough that you wouldn't want the compiler to insert it for all variables. I will also accept a demonstration that the compiler, without an alias sanitizer, performs more optimizations (this is probably easier, but then I expect a "modern C++" type book to list the rules for when disjoint is worth adding).
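As a rough idea of what such a runtime check could look like when written by hand, here is a sketch of an overlap test guarding a function that assumes non-overlapping buffers. Both function names are hypothetical, this is not an existing sanitizer API, and the comparison of pointers as integers assumes a flat address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the byte ranges [a, a + a_bytes) and [b, b + b_bytes)
// share at least one byte. Intended for non-empty ranges.
inline bool ranges_overlap(const void* a, std::size_t a_bytes,
                           const void* b, std::size_t b_bytes) {
    const auto a0 = reinterpret_cast<std::uintptr_t>(a);
    const auto b0 = reinterpret_cast<std::uintptr_t>(b);
    return a0 < b0 + b_bytes && b0 < a0 + a_bytes;
}

// A function that assumes dst and src do not overlap and verifies that
// assumption in debug builds (assert is compiled out under NDEBUG).
inline void checked_add(int* dst, const int* src, std::size_t n) {
    assert(!ranges_overlap(dst, n * sizeof(int), src, n * sizeof(int)));
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}
```

A check like this costs two comparisons per call, which hints at why one would not want it inserted for every pointer parameter in a program.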

(B) permit seamlessly converting non-`disjoint` pointers to `disjoint` pointers, in the case that they are in fact disjoint

No, no, no! You’re completely missing the point of the type safety here and the need for the reversed implicit-subtract semantics. The ability to seamlessly convert non-disjoint to disjoint is exactly why the noalias qualifier was rejected so harshly way back in the 80s:

https://www.lysator.liu.se/c/dmr-on-noalias.html

From: Arthur O'Dwyer <arthur.j.odwyer_at_[hidden]>
Sent: Tuesday, September 22, 2020 1:20 PM
To: Std-Proposals <std-proposals_at_[hidden]>
Cc: Eric Lengyel <lengyel_at_[hidden]>
Subject: Re: [std-proposals] RFC: disjoint qualifier

On Tue, Sep 22, 2020 at 4:06 PM Eric Lengyel via Std-Proposals <std-proposals_at_[hidden]> wrote:

> I think you have done all the wording before you've come up with any use-cases, and I think that's a problem

The use cases are exactly the same as they are for restrict in C, which I assume you’re familiar with.

Don't assume. It can't help, but (if you assume wrong) it might hurt.

> void foo(disjoint int *r, disjoint int *s);
> int a, b;
> foo(&a, &b);

> you're telling me that that wouldn't compile?

Correct, that would not compile. You would need to declare a and b like this instead:

disjoint int a, b;

That doesn't sound useful or semantically correct. I mean, aren't any two local variables always semantically disjoint? How should the programmer decide which variables to declare using the `disjoint` keyword and which ones to leave alone?

> disjoint int *p1 = ...;
> auto p2 = p1;
> static_assert(std::same_as<decltype(p1), decltype(p2)>); // ???

> If p2's type is the same as p1's, then they're both `disjoint int *` — and so you have two `disjoint int *` objects that point to the same place.

Yes, p1 and p2 both have type ‘disjoint int *’, and they both point to the same object. It is the programmer’s responsibility to use such pointers correctly, exactly the same as it would be with restrict, but with the added type safety that disjoint allows.
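The same type identity can be checked today with const, which also qualifies the pointee rather than the pointer. This sketch assumes that disjoint would behave like const in this respect, as the `disjoint int *` syntax suggests; the function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline int copy_demo() {
    int value = 3;
    const int* p1 = &value;
    auto p2 = p1;  // deduced as const int*: the pointee qualifier is kept

    // Same check as in the quoted question, spelled with the C++17 trait.
    static_assert(std::is_same_v<decltype(p1), decltype(p2)>);

    // Two pointers of identical qualified type now refer to one object.
    assert(p1 == p2);
    return *p2;
}
```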

It's the programmer's responsibility to track which `disjoint` variables are actually disjoint?

So the keyword provides no helpful semantics?

That's not my definition of "type safety." To me, "type safety" means that the compiler should

(A) prevent having two `disjoint` pointers to the same storage, or at least make it easier to write static-analyzer passes to detect that issue

(B) permit seamlessly converting non-`disjoint` pointers to `disjoint` pointers, in the case that they are in fact disjoint

Your proposal does neither of these things, so, basically, it seems like it doesn't do anything.

Use-cases, with sample code, would help.

–Arthur

-- 
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2020-09-23 16:46:25