std-proposals

The C++ 17&&20 object model is broken, and concepts would reveal it for the world to see

From: Omer Rosler <omer.rosler_at_[hidden]>
Date: Sat, 5 Oct 2019 02:52:46 +0300
This is meant to be a discussion before I write a formal proposal.
My presentation is not very clean, so please forgive me.
I wrote this in a proposal style already, but it is not final in any way.
Please tell me what you think.
Is my estimation of the importance of this problem realistic?
Is my solution tractable?
Does it break any C++ 14 code not already broken in 17/20?

*The problem*

A generic library that needs to create objects of the types it uses
cannot be implemented correctly in C++ 20 without heavy type-based
metaprogramming.

Why? Because concepts know nothing about storage and, by extension, about
the correct value-semantics usage of these objects (should I copy, move,
forward, or allocate them on the heap?).
This means that extracting the value semantics (and dispatching
accordingly) requires heavy SFINAE magic.
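
As a rough illustration of that dispatch, here is a minimal C++17 sketch that picks a storage strategy from type traits rather than from a concept. The names `storage_strategy`, `strategy_for` and the three sample types are hypothetical and are not part of the toy library discussed in this post.

```c++
#include <type_traits>

// Hypothetical classification of how a generic library may store a value.
enum class storage_strategy { move_in, copy_in, construct_in_place };

// Chooses a strategy purely from type traits. A concept such as Addable
// says nothing about which of these branches applies.
template <typename T>
constexpr storage_strategy strategy_for()
{
    if constexpr (std::is_move_constructible_v<T>)
        return storage_strategy::move_in;
    else if constexpr (std::is_copy_constructible_v<T>)
        return storage_strategy::copy_in;
    else
        return storage_strategy::construct_in_place;
}

// Sample types, one per branch (all hypothetical).
struct regular { int v; };

struct copy_only {
    copy_only() = default;
    copy_only(const copy_only&) = default;
    copy_only(copy_only&&) = delete;
};

struct pinned {
    pinned() = default;
    pinned(const pinned&) = delete;   // also suppresses the implicit move
};

static_assert(strategy_for<regular>() == storage_strategy::move_in);
static_assert(strategy_for<copy_only>() == storage_strategy::copy_in);
static_assert(strategy_for<pinned>() == storage_strategy::construct_in_place);
```

Every additional property the library cares about adds another branch of this kind, and none of it is visible in the constraint on the template parameter.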

This exposes a hole in the object model introduced in C++ 17.

We'll consider a single toy example that illustrates all of the problems
this causes (by the way, the `std::error` type proposed as part of the
"Herbceptions" proposal suffers from the same problem).

If you can write the example below correctly, then you must understand the
problem.

If you want to see the core issue, skip the following sub-section.
If you want to understand why it will be an issue of major importance after
concepts ship, read the example code first.

Toy example: lazy_addable_wrapper library

Look at a toy example of a concepts-based generic library that caches `+=`
operations on a value and applies all when requested.

Consider this library as a building block of two other generic libraries:
1. A "safe array smart-pointer" library that does not enable out of bounds
access by remembering the original arrays used to create the pointers.
2. An expression template library that can convert `a += b += b` into `a +=
(2*b)` at compile time (consider the type used to be inefficient to move,
such as matrices).

```c++
template <Addable A>
struct lazy_addable_wrapper
{
    // special member functions skipped for brevity
    template <typename B = A>
    requires std::is_convertible_v<B, A>
    constexpr lazy_addable_wrapper(B &&b) : cached_additions{},
starting_value(std::forward<B>(b))
    {
    }
     template <Movable B = A>
    requires Addable<B>
    constexpr lazy_addable_wrapper & operator+=(B &&a)
    {
          cached_additions.emplace_back(std::move(a));
         return *this;

    }
    // we still need a forwarding reference because we want to allow prvalue
    // references to non-movable types to choose this overload
    template <Copyable B = A>
    requires Addable<B> && !requires Movable<B>
    constexpr lazy_addable_wrapper & operator+=(B &&a)
    {
          cached_additions.push_back(std::forward(a));
          return *this;
    }
    template <Movable B = A>
    requires Addable<B> constexpr lazy_addable_wrapper
&operator+=(lazy_addable_wrapper<B> &&other)
    {
          (*this)+=
std::forward<decltype(std::forward<lazy_addable_wrapper<B>>(other).starting_value)>(std::forward<lazy_addable_wrapper<B>>(other).starting_value);
           std::move(std::back_inserter(cached_additions),
 std::make_move_iterator(other.cached_additions.begin()),
 std::make_move_iterator(other.cached_additions.end()));
          return *this;

    }
     template <Movable B = A>
        requires Addable<B> && !Copyable<B>
    constexpr lazy_addable_wrapper & operator+=(const
lazy_addable_wrapper<B> &other) = delete;
     template <Movable B = A>
    requires Addable<B> && Copyable<B>
    constexpr lazy_addable_wrapper &operator+=(lazy_addable_wrapper<B>
&&other)
    {
           (*this)+=
std::forward<decltype(std::forward<lazy_addable_wrapper<B>>(other).starting_value)>(std::forward<lazy_addable_wrapper<B>>(other).starting_value);
           std::move(std::back_inserter(cached_additions),
 std::make_move_iterator(other.cached_additions.begin()),
 std::make_move_iterator(other.cached_additions.end()));
          return *this;

    }
     template <Copyable B = A>
        requires Addable<B> && !Movable<B>
    constexpr lazy_addable_wrapper & operator+=(const
lazy_addable_wrapper<B> &other)
    {
          (*this)+=
std::forward<decltype(std::forward<lazy_addable_wrapper<B>>(other).starting_value)>(std::forward<lazy_addable_wrapper<B>>(other).starting_value);
           std::move(std::back_inserter(cached_additions),
 other.cached_additions.begin(), other.cached_additions.end());
          return *this;

    }
    constexpr A&& operator*() &&
    {
          return std::accumulate(cached_additions.begin(),
cached_additions.end(), std::move(starting_value), [](auto &&lhs, auto
&&rhs) { return std::forward<decltype(lhs)>+=
std::forward<decltype(rhs)>(rhs); });

    }

private:
     std::vector<A> cached_additions;
     A starting_value;
};
// deduction guides skipped for brevity

static_assert(CrossTypeAddable<lazy_addable_wrapper<int[10]>, ptrdiff_t>,
    "What?");
static_assert(CrossTypeAddable<
    lazy_addable_wrapper<
        lazy_addable_wrapper<int[10]>>,
    lazy_addable_wrapper<ptrdiff_t>>,
    "Weird, and problematic for multi-dimensional arrays");
```

I counted at least 4 bugs that cause these static assertions to fire (there
are probably more).

Two of them are fixable "inline".
One requires specializing the entire class template.
Can you list all of them?

The second static assertion would fire, and it is inherently unfixable in
C++ 20.

All of these bugs come from the case of non-copyable non-movable types.
They still bind to the references (as they are prvalues), but then, the
function body fails to compile.
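
A minimal C++17 sketch of that situation, using a hypothetical `pinned_value` type and hypothetical helper names: the prvalue is accepted by a forwarding reference, yet nothing inside the function could copy or move it into storage.

```c++
#include <cassert>
#include <type_traits>

// Hypothetical type that is neither copyable nor movable.
struct pinned_value {
    explicit pinned_value(int v) : value(v) {}
    pinned_value(const pinned_value&) = delete;
    pinned_value& operator=(const pinned_value&) = delete;
    int value;
};

// Returning a prvalue of a non-movable type is fine since C++17.
inline pinned_value make_pinned(int v) { return pinned_value{v}; }

// A forwarding reference accepts the prvalue: a temporary is materialized
// and the reference binds to it.
template <typename T>
int read_through_reference(T&& t) { return t.value; }

// A body that tried to store the argument, for example with
// `auto stored = std::forward<T>(t);`, would not compile for this type:
static_assert(!std::is_constructible_v<pinned_value, pinned_value&&>);
static_assert(!std::is_constructible_v<pinned_value, const pinned_value&>);

inline void demo_pinned()
{
    pinned_value direct = make_pinned(3);   // initialized in place, no move
    assert(direct.value == 3);
    assert(read_through_reference(make_pinned(7)) == 7);
}
```

The overload is selected on the strength of the reference binding alone, so the failure only shows up once the body is instantiated.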

The core of the problem
One of the most fundamental invariants of C++'s value semantics:

*User defined types are equivalent to builtin types*

This is true when we consider glvalues but not for *prvalues*.

With C++ 17's guaranteed copy elision, elision became part of the object
model, but users can't possibly write code that depends on it.
For example:

```c++
int arr[10][12];
lazy_addable_wrapper arr_ptr(arr[3]);
arr_ptr += 3 += lazy_addable_wrapper(lazy_addable_wrapper(arr[7]) += (-5));
```

Even if all required constructors, conversion functions and deduction
guides are implemented correctly, this would fail to compile because we are
binding an rvalue reference to a const object (the array reference
decays into a const pointer).

If we used the builtin array type, this would work due to array-to-pointer
decay.
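
For comparison, this is the builtin behaviour in isolation, as a small C++17 sketch. The probe `deduced_as_array` is a hypothetical helper; it only shows that a forwarding reference sees the array type itself, so any decay has to be spelled out by the generic code.

```c++
#include <cassert>
#include <type_traits>

// The builtin rule: an array used as a value decays to a pointer.
static_assert(std::is_same_v<std::decay_t<int (&)[12]>, int*>);

// Hypothetical probe: reports whether a forwarding reference saw an array.
template <typename T>
constexpr bool deduced_as_array(T&&)
{
    return std::is_array_v<std::remove_reference_t<T>>;
}

inline void demo_decay()
{
    int arr[10][12] = {};
    int* p = arr[3];                    // decay happens here
    p += 3;                             // builtin pointer arithmetic
    assert(p == &arr[3][3]);
    assert(deduced_as_array(arr[3]));   // no decay: T is int (&)[12]
}
```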

Why should the committee solve this problem?
*1. One of the long-term goals of the committee is to make generic
code easier to write*
A new defect in the C++ object model was introduced in C++ 17 with
guaranteed copy elision.
This problem has not been prominent so far, because it only affects
generic code.
- Once concepts become widespread with the release of C++ 20, generic
code will be more common.
- Binding prvalues to universal references would come to be regarded as
almost always a bug, just like "slicing".
*2. Generic C++ 17 wrapper libraries transitioning to concepts would cause
a behaviour change in user code.*
Consider an existing complete wrapper library written in C++ 17.
It disallows prvalues, as it cannot treat them properly.
If the library wants to refactor to using concepts instead of ugly
metaprogramming, then the concept would also have to disallow NonCopyable
and NonMovable types.
This would mean the concepts won't be satisfied by these types.
But if the concept is used elsewhere in a prvalue manner (which is the case
for the toy example - the matrices use case), then this use case is broken,
or we fragment our concept hierarchy.

If the concept is a standard library one, or in a different library (which
will probably become common in a few years from now) then this requires
forking the concepts library to maintain backward compatibility with this
use case.

*3. This is a huge broken invariant of the object model*

*4. The proposal aims to add prvalue references, which is one step closer
to an RAII based object model and guaranteed Ultimate Copy Elision.*
The unique property of prvalues in the object model is that they are "the
unique way to refer to their result object in the entire program" (which is
the core reason for delaying its creation to when it is actually used).
This is really a `unique_ptr` to the storage location of the result object,
where the destructor of the unique_ptr is what starts the lifetime of the
result object.
Extending this analogy with further language extensions (`restrict`
qualifier) would make the language more efficient by default.

*5. This proposal resolves some of the objections to Herb-ceptions
exception propagation zero-overhead guarantee*
Throwing an lvalue in the Herb-ception proposal is not guaranteed to be
zero overhead for the same reason NRVO is not guaranteed by the language in
any way.
Throwing a prvalue is OK, and catching by `prvalue` reference does not
materialize it.
This is achievable with the correct implementation of a `std::error` type
that treats prvalues differently than other types.
In fact, the response to this proposal is

Core proposal - *first-class prvalue references*
Strawman syntax (not essential at all, please no bikeshedding): `T !&&! t`
is a prvalue reference of type `T` with alias name `t`.

`t` does not behave like a regular value: taking its address or assigning
to it is forbidden (because it is a prvalue, not a glvalue), and if you
want a materialization conversion, you need to cast away the `pr`-ness
via `std::move` or a hypothetical `std::materialize`.

Forwarding references would also bind to prvalue references.

A placement new expression taking a `prvalue` reference argument is the
only way for the core language to perform a materialization conversion on a
`prvalue` reference.
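
In today's C++ 17 the closest equivalent is a placement new whose initializer is itself a prvalue, which constructs the object directly in the chosen storage. The sketch below uses a hypothetical helper `construct_in` and a hypothetical non-movable type `immovable`; it illustrates the existing rule, not the proposed semantics.

```c++
#include <cassert>
#include <new>
#include <utility>

// Hypothetical type that is neither copyable nor movable.
struct immovable {
    explicit immovable(int v) : value(v) {}
    immovable(const immovable&) = delete;
    immovable& operator=(const immovable&) = delete;
    int value;
};

// Hypothetical helper: constructs a T in `storage` from the prvalue that
// `factory` returns. Because the initializer is a prvalue of type T, the
// object is created in place and no copy or move constructor is needed.
template <typename T, typename Factory>
T* construct_in(void* storage, Factory&& factory)
{
    return ::new (storage) T(std::forward<Factory>(factory)());
}

inline void demo_construct_in()
{
    alignas(immovable) unsigned char buffer[sizeof(immovable)];
    immovable* p = construct_in<immovable>(buffer, [] { return immovable{5}; });
    assert(p->value == 5);
    p->~immovable();   // storage is managed manually, so destroy manually
}
```

Note that the prvalue has to be passed as a callable, because today there is no way to pass the prvalue itself through a function boundary without materializing it.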

`prvalue` references behave like aliases to objects, and they need to be
the unique owners of the lifetime of their result object (otherwise they
can't control materialization).

This means that they behave like "`unique_ptr` to the prvalue's lifetime,
where the destructor materializes them".
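
The ownership analogy can be approximated in C++ 17 with a wrapper that holds the recipe for a result object and only runs it when converted to the result type. This is a sketch under stated assumptions: `deferred` and `immovable_result` are hypothetical names, and here materialization happens at the conversion rather than in a destructor as the analogy suggests.

```c++
#include <cassert>
#include <type_traits>

// Hypothetical type that is neither copyable nor movable.
struct immovable_result {
    explicit immovable_result(int v) : value(v) {}
    immovable_result(const immovable_result&) = delete;
    immovable_result& operator=(const immovable_result&) = delete;
    int value;
};

// Hypothetical stand-in for a "prvalue reference": it owns the recipe for
// a result object and only runs it when converted to the result type.
template <typename F>
struct deferred {
    F make;
    operator std::invoke_result_t<F&>() { return make(); }
};

template <typename F>
deferred(F) -> deferred<F>;

inline void demo_deferred()
{
    int calls = 0;
    auto recipe = deferred{[&calls] { ++calls; return immovable_result{42}; }};
    assert(calls == 0);              // nothing has been created yet
    immovable_result r = recipe;     // the result object is created here
    assert(calls == 1 && r.value == 42);
}
```

Unlike a real prvalue, nothing stops the recipe from being converted twice, which is exactly the uniqueness guarantee a first-class prvalue reference would have to enforce.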

To fix the toy example we would need to add:
```c++
namespace std
{
template<typename T>
constexpr T && materialize(T !&&! t)
{
    aligned_storage_t<sizeof(T), alignof(T)> storage;
    T &!!& unique_storage_ref = std::move(new (&storage) T(t));
    // guaranteed copy/move elision as we return a prvalue, and so the
    // above line will be performed on the caller stack frame
    return unique_storage_ref;
}
}

template<Addable A>
struct lazy_addable_wrapper
{
    //...
    A operator*() !&&! //cv qualifier
    {
        return {*std::materialize(*this)};
    }
    template<MovableOrCopyable B = A>
    A operator*() && //cv qualifier
    {
        //same as before
    }
     template<Copyable B = A>
     A operator*() & = delete;
```

Can you spot more new bugs in this code?

Extension Proposal
Just as move semantics needed to integrate with the language, this new type
of reference needs a deep integration as well.
I have an outline, and it also solves the problem of transferring this
value semantic information, but I won't bother with publishing it in one
go (again...) as it would cause this to be too large.
In short, for this `prvalue` reference to integrate well, we need a
mechanism to transfer the ownership of a `prvalue` just like a return
statement behaves like a lifetime extension of a prvalue.

In fact, the lifetime extension rules could be reframed in terms of prvalue
references (and that would convert many dangling reference bugs into
compile time errors).
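
For reference, this is the existing lifetime extension rule that such a reframing would start from, shown as a small C++ 17 sketch with a hypothetical `tracer` type that counts live objects.

```c++
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct tracer {
    explicit tracer(int& alive) : alive_(alive) { ++alive_; }
    tracer(const tracer&) = delete;
    tracer& operator=(const tracer&) = delete;
    ~tracer() { --alive_; }
private:
    int& alive_;
};

inline tracer make_tracer(int& alive) { return tracer{alive}; }

inline void demo_lifetime_extension()
{
    int alive = 0;

    make_tracer(alive);        // temporary dies at the end of this statement
    assert(alive == 0);

    {
        const tracer& extended = make_tracer(alive);
        (void)extended;
        assert(alive == 1);    // lifetime extended to the reference's scope
    }
    assert(alive == 0);
}
```

The extension only applies when the reference binds directly to the prvalue. Once the prvalue has gone through a function parameter, the same-looking binding dangles, and that is the class of bug the paragraph above expects to turn into a compile-time error.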

This mechanism must be part of the core object model, and it should behave
similarly to the semantics of the `restrict` qualifier in C.

Summary

Handling of prvalues by generic code is a major hole in the object model
that needs to be fixed.
The simplest solution is the most direct one - add a reference
corresponding to this value category.
This solution is especially beneficial if integrated correctly into the
language (a safer and more efficient language, with vocabulary types that
are truly zero-overhead).

Received on 2019-10-04 18:52:10