C++ Logo

SG12

Advanced search

Subject: Re: Feedback on P2234R0 "Consider a UB and IF-NDR Audit"
From: Scott Schurr (s.scott.schurr_at_[hidden])
Date: 2020-10-21 11:15:27


Hi Jonathan,

On Wed, Oct 21, 2020 at 6:10 AM Jonathan Wakely <cxx_at_[hidden]> wrote:

> Re http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2234r0.pdf
>
> Thanks for working on this, Scott. I approve of the overall goal, but have
> some comments on specific parts.
>

Thanks for your careful reading of the paper. I appreciate your feedback
on the suggested approaches. I believe every one of them. I'm happy to
fix those examples which are fixable and remove any suggested approaches
that are simply not workable. Would you be willing to help with that?

And, yes, I completely agree that participation and review by optimizer
writers is critical to the successful outcome of the audit. I'm delighted
that you seem to have a sense for who these folks would be for gcc. Of
course we'll need clang and Visual Studio optimizer writers involved with
the review as well. Yup, that will certainly influence how quickly the
work can proceed.

I'm quite happy that you approve of the overall goal. Motivating the audit
(or something like it) was really the point of the paper.

Scott Schurr
S.Scott.Schurr_at_[hidden]

> Re User Specializations of Certain Templates in Namespace std
>
> Regarding a new attribute or compiler feature to disallow specializing
> templates defined in namespace std. Sometimes the std::lib implementation
> wants (or needs) to specialize a template, even though users can't. Either
> the attribute needs to have a way to disable it on certain specializations,
> or the compiler needs to be taught about every special case to know if a
> given specialization is OK. For the former case, users can just cheat and
> use the same mechanism to disable the decoration on their own
> specializations, although I don't think it's reasonable to defend against
> sufficiently motivated "attackers" so maybe that doesn't matter. The latter
> case (teaching the compiler about every special case) will not work if e.g.
> libstdc++ adds a new specialization of a template in std, and somebody
> tries to compile it with a version of Clang or EDG released last year,
> which doesn't yet know about the extra specialization it needs to allow. I
> think in practice any attempt to **require** such specializations to be
> ill-formed is hard to get right.
>
> Unless I missed it, the paper only talks about specializations for the
> handful of templates where **any** program-defined specialization is
> forbidden. It's also undefined to specialize something like
> std::allocator<int> or std::basic_string<signed char>, only specializations
> that depend on a program-defined type are allowed (C++20 16.4.5.2.1
> [namespace.std] p2). That's another huge class of UB (every template in the
> std::lib is affected!) which I think is also impractical to make
> ill-formed. It might seem that a compiler could issue a diagnostic for
> violations of the rule by requiring every specialization of a template in
> std to depend on a type defined outside std. But there are types which are
> part of the implementation which are defined outside std (e.g. ::mbstate_t
> or __gnu_cxx::__some_internal_detail) so the compiler would need to be
> taught about every one of those as well.
>
> Re Flowing Off the End of a Non-void Returning Function
>
> Requiring the addition of a call to abort() or similar doesn't only affect
> code ordering. It can increase the code size if the compiler can't prove
> the abort is unreachable. It's not just some extra text in the source code,
> it's extra instructions that aren't needed, which some users of C++ will
> say is unacceptable. To extend the mask analogy, it's not necessary for
> athletes to wear a mask while competing in a 100m sprint. Masks are good,
> but requiring them in all situations without exception causes other
> problems. So I think a new decoration is necessary instead of requiring a
> call to a [[noreturn]] function.
>
> Some compilers provide extensions to inform the compiler that a certain
> code path (such as flowing off the end of a non-void function) is
> unreachable, e.g __builtin_unreachable. That could be standardized. Or the
> [[noreturn]] attribute could be reused. Instead of the more general
> __builtin_unreachable (which can appear anywhere in a function) the
> [[noreturn]] attribute could be allowed on an empty statement at the end of
> a function, to indicate that omitting the return statement is intentional
> because it isn't reachable. It doesn't remove the potential for undefined
> behaviour, because the user who adds that decoration could be wrong, and if
> control flow does actually get there it's still undefined. But the common
> case of accidentally forgetting the return statement or failing to handle a
> conditional case could be turned into a required diagnostic. If the user
> explicitly adds [[unreachable]] or [[noreturn]] we've reduced the incidence
> of UB to the cases where they're explicitly wrong, not just accidentally
> careless.
>
> Re memcpy
>
> The paper says "It would probably be undesirable to have different
> implementations for ::memcpy and std::memcpy." More than undesirable, it
> would be non-conforming. The standard requires that &::memcpy ==
> &std::memcpy i.e. there is only one memcpy. I am very strongly opposed to
> removing that guarantee.
>
> Re Infinite Loops
>
> Any given implementation is allowed to decide not to optimize away
> infinite loops, if they serve a purpose in that environment. That would be
> a QoI matter, so I don't think intermediate programmers on embedded systems
> is a good motivation to prevent all optimisers from relying on the existing
> rule (although maybe it could be made dependent on freestanding/hosted
> implementations).
>
> This whole section makes the solution sound very simple, but I think there
> needs to be very careful consideration of all the ways that current
> implementations make use of the existing rule.
>
>
> Re Refine Proscription Wording Within the Standard
>
> Although I'm not opposed to the rephrasing of [namespace.std] p6, it
> seems like a poor example to choose. What is the practical benefit of
> postponing the unspecified behaviour "until the pointer is actually used"?
> How many programmers are taking the address of std::lib functions, but not
> using the result? Why do we care? Why do we want to allow it? In all
> practical cases, if you do $BADTHING then the program's behaviour is
> unspecified. Rephrasing it doesn't seem to change anything.
>
> Similarly in the section talking about "unspeciformed", you say "As Joshua
> says, we would make the world a safer place". Well as already stated, in
> practice you never get a chicken. It
>
> The suggested change for addressable functions and the introduction of a
> new term of art would remove hypothetical danger that doesn't affect
> anybody, intermediate programmer or not.
>
>
> re A Process to Identify Useful Changes to the Standard
>
> An audit of how optimising compilers actually rely on some of the rules is
> essential before changing them. We don't necessarily have all the right
> people in WG21, e.g. for GCC the "middle-end" experts who work on the
> optimisers are not involved with WG21, and unlikely to be part of the
> "small team" assembled for step 3b. I don't think that needs to be said in
> the paper, but just FYI there could be delays in the process due to
> soliciting external feedback.
>
> Anyway, sign me up for team 3c, but I'll be surprised if UB in the library
> can be significantly reduced beyond what was already done by the following
> (and others I can't find right now). LWG already went to A LOT of effort
> making unnecessary UB ill-formed.
>
> Mandating the Standard Library: Clause 16 - Language support library <
> https://wg21.link/p1458r1>
> Mandating the Standard Library: Clause 18 - Diagnostics library <
> https://wg21.link/p1459r1>
> Mandating the Standard Library: Clause 20 - Strings library <
> https://wg21.link/p1462r1>
> Mandating the Standard Library: Clause 21 - Containers library <
> https://wg21.link/p1463r1>
> Mandating the Standard Library: Clause 22 - Iterators library <
> https://wg21.link/p1464r1>
> Mandating the Standard Library: Clause 25 - Algorithms library <
> https://wg21.link/p1718r2>
> Mandating the Standard Library: Clause 26 - Numerics library <
> https://wg21.link/p1719r2>
> Mandating the Standard Library: Clause 27 - Time library <
> https://wg21.link/p1686r2>
> Mandating the Standard Library: Clause 28 - Localization library <
> https://wg21.link/p1720r2>
> Mandating the Standard Library: Clause 29 - Input/Output library <
> https://wg21.link/p1721r2>
> Mandating the Standard Library: Clause 30 - Regular Expression library <
> https://wg21.link/p1722r2>
> Mandating the Standard Library: Clause 31 - Atomics library <
> https://wg21.link/p1723r2>
> Mandating the Standard Library: Clause 32 - Thread support library <
> https://wg21.link/p1622r3>
>



SG12 list run by sg12-owner@lists.isocpp.org