C++ Logo

sg20

Advanced search

Re: [SG20] A draft paper to fix the range-based for loop to make it teachable

From: Nicolai Josuttis <nico_at_[hidden]>
Date: Tue, 10 Nov 2020 16:50:10 +0100
Thanks Arthur,
see below.

Am 10.11.2020 um 15:51 schrieb Arthur O'Dwyer:
> On Tue, Nov 10, 2020 at 4:25 AM Nicolai Josuttis via SG20
> <sg20_at_[hidden] <mailto:sg20_at_[hidden]>> wrote:
>
> Hi,
>
> for years, I have significant problems to teach the range-based for
> loop, because of the bug that it has undefined behavior when iterating
> over a reference to a temporary.
> Thus, I
> - either have to tell beginners significant constraints in using it
> - or have to teach all the details why it is broken
> including references, lifetime extension, universal references
>
> Attached is a draft paper to fix the problem.
> I welcome all feedback and if you want to be a co-author
> (the more support the paper has the better).
>
>
> (1) I approve of the proposed change, which IIUC is simply to move the
> destruction of the range object from its current position above the loop
> to down below the loop. It sounds like the goal is to make `for (auto&&
> elt : range) { x }` work basically the same as `ranges::for_each(range,
> []() { x })`, as shown in the lambda example on page 5. This is a good goal.
>
interesting point. thanks.

> (2) I strongly disapprove of the sensationalistic title. This one change
> will not move the needle even a tiny bit on "healing C++"; the proposal
> addresses a corner case of a corner case. Range-based `for` is already
> quite teachable — or rather, it doesn't even need to be taught to most
> students, because they're already coming from languages that have the
> same `for` loop. People are also already aware that loops have pitfalls;
> the most common one is "modifying the container inside the loop," which
> usually leads to UB, and isn't addressed at all in your paper. (Your
> paper shouldn't address it. But that's just one example of how your
> paper doesn't "heal C++" and can't even be said to fix more than one of
> the minor existing problems around the `for` loop.)
> My suggested title is "Increased lifetime for `for` loops' range objects."
>
ok, thanks, got it
(I meant it as one paper of a series of papers to heal...)

> (3) FWIW, I consider the motivating problem not a problem with for at
> all. for is easy to teach. The culprit here is "view types" (I've also
> called them "parameter-only types") — types which pretend to "have" an
> iterable range of elements, without actually participating in the
> ownership of that range, so that they can dangle if the backing storage
> is deallocated too early. C++20 Ranges makes this problem 10x worse, for
> sure. But your paper does a very good job of demonstrating that the
> problem is not confined to Ranges; you can get it via C++17 string_view
> or C++20 span as well, or even by chaining method calls as in `for
> (auto&& elt : foo().bar())`.
>
but in which sense is iterating over the elements of the first vector in
vector<vector<int>> a view?

> (3) On page 4 you say, "the API of ranges was significantly modified."
> Could you explain more (with a URL, or in a footnote) what you mean?
>
I thought I do by referring to
 https://cplusplus.github.io/EWG/ewg-active.html#120
and quoting the example from there:
Ranges initially wanmted to support (and did support) this:
> std::vector<int> vec;
> for (int val : vec | boost::adaptors::reversed
> | boost::adaptors::uniqued) {
> // Do stuff with val
> }

But this code falls exactly in the category of the problem.

> (4) On page 6, does "one of the authors" mean "one of the Facebook
> people"? or "Nico Josuttis"?
>
One of the 3 authors, not Nico ;-)
(let me ask him for permission to quote his name).

> (5) /rather have less than more security risks/have fewer security
> risks, not more/
> (6) /significant more/significantly more/
>
thanks

> (7) Top of page 10: "Are there other places in the language that have
> similar problems?" Yes, obviously, literally anywhere that a
> dangling-reference bug might occur — that'd be a "similar problem." C++
> is full of dangling references. For every place someone writes
> for (auto elt : foo().bar())
> you'll find two places where someone accidentally writes
> const auto& local = foo().bar();
> The latter is /*easier to fix*/ once the programmer has been notified of
> the bug; but I've also seen it /*happen */more often. (Although I admit
> that I probably see it more often simply because it's easier to spot
> visually.)
>
OK, I agree, "similar" is probably wrong.
The special quality is that the usa of a reference is not visible to the
programmer. That we don't have anywhere else.
So a key warning signal (and something I first have to teach) is missing.


> (8) Your example on page 10 seems mistaken, or else it needs further
> elaboration. You say that a loop like
> for (auto elem : syncObj->getValues())
> has a deadlock, but changing it to
> for (const auto& elem : syncObj->getValues())
> somehow removes the deadlock? I don't think the ref-ness of `elem` can
> possibly affect the outcome. If I'm right, and the ref-ness doesn't
> matter, then you should pick one type for `elem` and stick with it
> throughout the whole example.
>
yes, thanks, it would not fix, it would reveal the bug.

> (9) "a programmer still can avoid any overhead of the
> range-based for loop by using the optional initializer of the loop" —
> Initially, I thought you had this backwards. I had to work it out by
> hand (see below) to see what you were alluding to. Please add these
> examples to the paper, around this point. Here are the examples:
> IIUC, in C++20 today,
> struct X { X b(); X c(); }; X a();
> for (auto elt : a().b().c()) { use(elt); }
> compiles into the moral equivalent of
> X aa = a(); X bb = aa.b(); X cc = bb.c();
> destroy(bb); destroy(aa);
> for (auto elt : cc) { use(elt); }
> destroy(cc);
> Your proposal says that it /should /compile into the moral equivalent of
> X aa = a(); X bb = aa.b(); X cc = bb.c();
> for (auto elt : cc) { use(elt); }
> destroy(cc); destroy(bb); destroy(aa);
> Then, if we look at C++20 ranged-for-with-initializer, today
> for (auto bb = a().b(); auto elt : bb.c()) { use(elt); }
> compiles into the moral equivalent of
> X aa = a(); X bb = aa.b();
> destroy(aa);
> X cc = bb.c();
> for (auto elt : cc) { use(elt); }
> destroy(cc);
> destroy(bb);
> Your proposal doesn't propose any change to these semantics. Notably,
> you are not proposing that the lifetime of "X aa" be increased all the
> way to the bottom of the loop. "X aa" is just a temporary object used
> briefly in the initializing expression of `bb`, and so it should still
> get destroyed quickly at the end of the full-expression `a().b()`.
>
thanks

> (10) I would like to see more explicit discussion of why you're not
> going to come back in three years complaining about how
> for (auto bb = a().b(); auto elt : bb.c()) { use(elt); }
> has a dangling reference to the result of `a()`. Why do you consider
> this dangling reference OK, whereas the one you're fixing has been some
> kind of mortal blow to C++'s teachability?
>
Well, first, so far I NEVER used or taught this new loop at all.
Second I am confused.
As bb is not a reference, where is the problem here?


> (11) Nitpick: The proposed wording, top of page 12, needs to mention
> `goto` as well as goto-labels. This proposal should have no effect on
> the program's ability to transfer control into and out of a for-loop's
> body. But of course the wording is extremely handwavey at the moment; I
> don't like any of these options; CWG will have to weigh in.
>
thanks

> (12) I very much appreciate the way you avoid using the phrase "lifetime
> extension" to refer to this new thing. In my proposed title for your
> paper, I deliberately called it "Increased Lifetime," instead of
> "Extended Lifetime," because the mechanism you're proposing here is
> significantly different from C++98's "lifetime extension," and I think
> it's really important not to confuse the two mechanisms. I also kind of
> like the phrases "deferred destruction" and/or "delayed destruction [of
> range objects]."
>
I don't introduce a new language rule at all.
All I say is that the range-based-for-loop operates as a blacvk box
dealing with the expression after the : like with a function argument.

> (13) If you haven't read it lately, you might want to (re-)read this
> blog post of mine:
> https://quuxplusone.github.io/blog/2020/03/04/field-report-on-lifetime-extension/
> <https://quuxplusone.github.io/blog/2020/03/04/field-report-on-lifetime-extension/>
> Specifically, "False Positive #1."
> IIUC, your proposal would effectively cause `for` loops to be lowered
> into a form that does not rely on lifetime extension(*), which is awesome.
> (* — Except if the "element" variable itself is lifetime-extended. This
> is more or less what happens in "True Positive #3" in my blog post.)
>
thanks

> Modulo my strong objection to the current title in point (2), I would
> gladly sign on to this paper, if you wanted.
>
I will remove "heal C++" and deal with other comments.
Thanks

> my $.02,
> –Arthur

-- 
Nicolai M. Josuttis
www.josuttis.de
+49 (0)531 / 129 88 86
+49 (0)700 / JOSUTTIS
Books:
 C++: http://cppstdlib.com, http://cppstd17.com,
      http://tmplbook.com, http://cppmove.com
 SOA: http://soa-in-practice.com

Received on 2020-11-10 09:50:17