std-proposals

Re: [std-proposals] Pre-Draft Proposal for elaborate extension points

From: William Linkmeyer <wlink10_at_[hidden]>
Date: Mon, 19 Jun 2023 00:57:27 -0400
I think this is useful. I’m interpreting the lack of comments as a sign that there’s nothing obviously wrong. 

WL

On Jun 17, 2023, at 8:32 AM, Михаил Найденов via Std-Proposals <std-proposals_at_[hidden]> wrote:


Here is a raw prototype with what is possible using current techniques.

As noted, the class concept is basically a template, where the `this` pointer type is the template argument. 
This can be expressed almost completely with C++23, because of explicit object parameters:

template<class I>
struct Shape {
  using coords_t = int;
  static const bool constant = false;
  static bool is_intersection(const I&, const I&) { std::cout << "Shape::intersect, constant is " << I::constant; return false; }

  auto width(this const I&) { return typename I::coords_t(0); }
  auto height(this const I&) { return typename I::coords_t(0); }

  auto area(this const I& t) { return t.width() * t.height(); } //< will call I::width and height, not Shape::width and height
};


`I` stands for Implementation.

As you can see, little has changed from a normal C++ class. Probably the biggest workaround needed is with "associated types". 
In particular, I::coords_t will not work as a return type (I is still incomplete when the declaration is instantiated, so it fails to compile) and we have to do a cast inside the function instead.

We implement this "interface" by subclassing

struct S : Shape<S> {
  using coords_t = double;
  coords_t w;
};

We can use S as Shape now. 

S s{{}, 12.25};
std::cout << s.area();

Of course, this will return 0, but area will be of the correct type - double. 
We have overridden the associated type, as far as usage is concerned! 
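
For readers without a compiler that supports explicit object parameters, roughly the same effect can be sketched with classic CRTP and a static_cast. This is only an approximation under that assumption, not the prototype itself; DefaultShape and the private self() helper are names made up for the illustration.

```cpp
#include <type_traits>

// Classic CRTP stand-in for the "class concept": I is the implementing type.
template <class I>
struct Shape {
  using coords_t = int;  // default associated type

  // Return types are deduced, so I::coords_t is only looked up when a member
  // is actually used, by which point I is a complete type.
  constexpr auto width() const { return typename I::coords_t(0); }
  constexpr auto height() const { return typename I::coords_t(0); }
  constexpr auto area() const { return self().width() * self().height(); }

 private:
  // Hypothetical helper: view *this as the implementing type.
  constexpr const I& self() const { return static_cast<const I&>(*this); }
};

// Keeps every default, including coords_t = int (found through the base).
struct DefaultShape : Shape<DefaultShape> {};

// Overrides only the associated type.
struct S : Shape<S> {
  using coords_t = double;
  coords_t w = 0;
};
```

With these definitions, S{}.area() still evaluates to zero, but its type follows S::coords_t, while DefaultShape{}.area() stays an int.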

Adding member functions completes the implementation 

struct S : Shape<S> {
  using coords_t = double;
  coords_t w;

  auto width() const { return w; }
  auto height() const { return w; }
};

S s{{}, 12.25};
std::cout << s.area(); //< prints approximately 150 (12.25 * 12.25 = 150.0625)

Interestingly, implementation outside the class is much the same. 
However we need to wrap the original type, no way around this without proper compiler support.

struct S {
  using coords_t = double;
  coords_t w;
};

struct _implShapeForS : Shape<_implShapeForS>
{
  S& s;
  _implShapeForS(S& s) : s(s) {}
};

This alone gives us a "default implementation" for is_intersection, for example (by implicitly creating _implShapeForS from S)

S a, b;

if(_implShapeForS::is_intersection(a, b)) { //< is_intersection DEFAULT
  ...
}

We can "override" it:

struct _implShapeForS : Shape<_implShapeForS>
{
  using Impl =  _implShapeForS;

  S& s;
  _implShapeForS(S& s) : s(s) {}

  static bool is_intersection(const Impl&, const Impl&) { ... }
};

S a, b;

if(_implShapeForS::is_intersection(a, b)) { //< is_intersection IMPLEMENTATION
  ...
}

And we can add members:

struct S {
  using coords_t = double;
  coords_t w;
};

struct _implShapeForS : Shape<_implShapeForS>
{
  using Impl =  _implShapeForS;

  S& s;
  _implShapeForS(S& s) : s(s) {}

  static bool is_intersection(const Impl&, const Impl&) { ... }

  auto width() const { return s.w; }
  auto height() const { return s.w; }
};

S s{12.25};

std::cout << _implShapeForS(s).area(); //< prints approximately 150 (150.0625)

At this point, we have a complete implementation outside of the class! 
We can add or remove any function, any type or constant. We can also call the default implementation, if wanted:

struct _implShapeForS : Shape<_implShapeForS>
{
  ...
  auto area() const { return this->Shape<_implShapeForS>::area() / 2; }  //< implement in terms of DEFAULT
};

S s{12.25};

std::cout << _implShapeForS(s).area(); //< prints approximately 75 (150.0625 / 2)
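
The whole wrapper approach can also be sketched in C++17 with classic CRTP instead of explicit object parameters. This is an approximation for illustration only: Square, ShapeForSquare, sample and self() are made-up names, and the side length 4.0 is chosen just to keep the arithmetic exact.

```cpp
// CRTP base holding the default implementations (stand-in for the class concept).
template <class I>
struct Shape {
  constexpr double width() const { return 0; }
  constexpr double height() const { return 0; }
  // Calls the implementing type's width/height, not necessarily the defaults.
  constexpr double area() const { return self().width() * self().height(); }
  static constexpr bool is_intersection(const I&, const I&) { return false; }

 private:
  constexpr const I& self() const { return static_cast<const I&>(*this); }
};

// Existing class, left untouched.
struct Square {
  double w;
};

// Hypothetical middle-man: wraps a Square and decides, member by member,
// whether to use its own code or the default from Shape.
struct ShapeForSquare : Shape<ShapeForSquare> {
  const Square& s;
  constexpr ShapeForSquare(const Square& sq) : s(sq) {}  // implicit on purpose

  constexpr double width() const { return s.w; }
  constexpr double height() const { return s.w; }
  // Implemented in terms of the default area().
  constexpr double area() const { return Shape<ShapeForSquare>::area() / 2; }
};

// Sample object so the wrapper can be exercised at compile time.
inline constexpr Square sample{4.0};
```

Here ShapeForSquare(sample).area() gives 8 (the default computes 4 * 4 through the wrapper's width and height, then the override halves it), and ShapeForSquare::is_intersection(sample, sample) reaches the default through the implicit conversion.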

One thing we can't do, however, is to use implementations from S directly, the way we use the default implementation. 
Even if we subclass S as well, this will only give us access to the static members, while creating a mess in more places than it is worth. 

Without compiler support, we are limited to wrapping/redirecting manually from the implementation to the class (from _implShapeForS to S),
by adding the needed types, functions and constants manually.

And of course, the other missing piece is the calling expressions. We need a way to transform calls from the concept class to the implementation:

s.Shape::area() must resolve to s._implShapeForS::area(), if present. 

Needless to say, _implShapeForS must also not require either wrapping or instantiation. This requirement overlaps with "extension methods take 2", which I wrote a few weeks back.
Essentially, _implShapeForS acts as a hidden "extension class", which redirects the calls to the correct implementation - itself, class or default. 
It is accessed via the "base" concept class and the concrete class interface stays untouched at all times. 

Using this prototype, it is evident how this approach differs from class specialization. Instead of specializing the base interface for a concrete type, we instantiate it with a middle-man, which is the broker b/w the base and the type.
This way the base is fully usable, as it is not replaced by another, more specialized class. It is the middle-man, the "implementation", that has to pick and choose from where the final code is coming from, including the base.
The implementation is not a (more specialized) replacement of the base.

Now the remaining work is to make the middle-man transparent to both the type and the base, so that calls (as well as static member and types!) start at the base, pass through the implementation and end up in the type (if needed).

The middle-man is invisible, the same way a vtable is invisible: A call to a base (function), passes through the vtable, transparently, ending up in the concrete type, or goes to the base. 
Here it is similar - Shape::something passes through the implementation (_implShapeForS) transparently, ending up in the concrete type, or goes to the base, or stops at the implementation.

Obviously, no function pointers are involved here at any point. It is all about calling the implementation (the middle man). It will always be implemented with the right answer.


On Sun, Jun 11, 2023 at 2:51 PM Михаил Найденов <mihailnajdenov_at_[hidden]> wrote:
Hello, here is an outline of a system that serves to 
 - be a reliable and easy-to-use "extension point" mechanism
 - bridge the gap b/w virtual functions and constrained templates

(Spoiler, it is similar to Circle's impl, but there are differences, some of them significant)

First let's step back and take a look at one of the most widespread ways to write abstract code - the virtual class. 

Given 
void func(Car s);
to make the above work on any Car-like, one must copy the interface of Car into a new class, with a new name, and make some or all functions virtual. 
Then the original Car, as well as all users, must subclass the new class, let's call it Vehicle, and also change the param of func to take Vehicle by reference. 
Last, but not least, we are forced into member-style syntax.

This is both easy and hard, and both powerful and limiting, depending on the viewpoint. 
One thing is certain, for better or worse, this model is a monumental success overall (not just C++, but overall).  This tells us, it is not "hard", compared to its benefits and is also powerful enough. 

Interestingly, stepping inside templates, we gain "easiness" - no more interfaces!, and also power - we can abstract associated types, we are not limited by call expressions style, etc. 
Or so it seemed. In practice, after decades of experience, it turned out, to write a correct algorithm template, where the author both gives as much flexibility to the user while at the same time has guarantees of a reliable implementation, 
is way, waaaaaay harder than using abstract classes. Not only that, the original algorithm will be buried inside all the extra machinery needed for the correct templatization. 

Ideally, a template based generic function should need similar effort, compared to abstract classes. 
Only then can it be a real alternative. If not, correct templates will stay with the experts, and the rest of the world will continue to use naive templates and abstract classes, as is the case today.

How can we improve that? First thing is to build upon what is known and understandable, then provide a simple step to gain more power on top of that. 
This is the model that made abstract classes so successful! 
From the well known class (the understandable), make a simple step (add the keyword virtual) and gain considerable power.

This is our goal. Have the same effort, the same simplicity as with virtual classes. 

Back to our "customization point" case, given what we know and understand - the class: 

struct Printer {
  using Canvas = Screen;
  static void connect(Printer&, Device&);
  void print(string s) const;
};

Add the keyword concept 

struct concept Printer {
  ...
};

Gain power right away:

In a dependent context, 
 - you can have different Printers, each with a different Canvas (associated types),
 - have different static functions and of course different print() implementations. 
 - classes do not need to subclass Printer (but they can)
Most of these are not possible with abstract base classes, yet the effort was similar, so there is an incentive! 

On the algorithm definition side the effort is similar to abstract classes as well:

void func(Printer auto& p) { //< printer is not a concrete class any more
  // use Printer 
}

As you can see, Printer is (also) a concept. 
It will act similarly (but not the same!) to a compiler-generated expression-based concept.

There are two important differences, which are the reason why class concepts do not replace regular concepts, but amend them.

The first thing is not required, but highly desirable. 
Upon instantiation, the compiler should check the class concept, so that it knows what each member is, ending the need for typename and template inside function bodies! Again, not required, but an embarrassment without it.
Full definition checking is not proposed.

The second thing is required and is one of the things that makes class concepts suitable as extension points.
All member names (static and instance) belong to the concept class. The type can NOT provide an overload to the given name. (The concrete class is not the master here, it is the class concept.)

void func(Printer auto& p) {
  p.print("hi"); //< OK
  p.print(12); //< ERROR, print is only what Printer says it is!!! 
  p.something_else(); //< OK, we don't care
}

The reason for the above change is that class concepts are nominal requirements - a must for any extension point implementation.
They complement regular concepts (structural requirements). In fact, the following is perfectly valid:

template<class T>
concept MovablePrinter = Printer<T> && std::movable<T>; 
  
As said, a class concept is a kind-of-concept, if it is not already obvious by the name, with the template argument being implicit.

Let's look at the other half of a class concept, the "class" part. 
It provides something regular concepts do not have - a place to have an implementation!

 struct concept Printer {
  using Canvas = Screen; //< assoc type (default) 
  ...
  void print(string s) const { std::cout << s << "\n"; } //< print implementation (default)
};

This way, a class implementing the interface does not need to provide all the implementations (like it is the case with template specialization). It can, however, still call the default implementation.

A class concept is a limited class. 
In particular, its `this` pointer is of unknown type, effectively a template param. 
It also cannot have instance members, because, well, there are no instances of it - it cannot be instantiated directly. The only way to instantiate it is as a base implementation for another, concrete class, similar to a purely abstract base class.

This can be normal subclassing:

struct concept Shape {
  using coords_t = int;
  static bool is_interesection(const Shape&, const Shape&) {  //< Shape here is a stand-in for the implementing type. 
    /*check bounding box intersection*/
  }
  coords_t    width() const;
  coords_t    height() const;
  int area() const { return width() * height(); }
  ...
};

// Subclass 

class RectF : Shape {
  ...
  using coords_t = float;
  coords_t width() const { return this->getWidth(); }
  coords_t height() const { return this->getWidth(); }
  ...
};

Even at this limited usage, we gain additional power, compared to standard subclassing: 

void func(Shape auto sh) {
  // use Shape

  // in particular Shape::is_interesection will be called with the subclassed type
  // in contrast to normal subclassing 
}

RectF r;
func(r); //< is_interesection is called with RectF

However, the implementation can also be external!

Existing class:

struct RectF {
  ...
};

Add Shape conformance:

RectF : Shape {
  ...
  coords_t width() const { return this->getWidth(); }
  coords_t height() const { return this->getWidth(); }
};

The use is unchanged: 

void func(Shape auto sh) {
  // use Shape: Assoc type, static and instance methods
  // They will resolve to either 
  //  - the class concept default implementation
  //  - or the concrete class external implementation
  //  - or the concrete class members, in case of an empty external implementation and matching member names.
}
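
To make the three-way resolution above concrete, here is a rough C++17 approximation using a trait plus the detection idiom. It is not the proposed mechanism: shape_impl, has_external_area, has_member_area and shape_area are all hypothetical names, and the member fallback here is structural (any matching area() is picked up), whereas the proposal requires an explicit, even if empty, implementation block to opt in.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical hook for an external implementation; the primary is empty.
template <class T>
struct shape_impl {};

template <class T, class = void>
struct has_external_area : std::false_type {};
template <class T>
struct has_external_area<
    T, std::void_t<decltype(shape_impl<T>::area(std::declval<const T&>()))>>
    : std::true_type {};

template <class T, class = void>
struct has_member_area : std::false_type {};
template <class T>
struct has_member_area<T, std::void_t<decltype(std::declval<const T&>().area())>>
    : std::true_type {};

// Entry point playing the role of Shape::area.
template <class T>
constexpr int shape_area(const T& t) {
  if constexpr (has_external_area<T>::value) {
    return shape_impl<T>::area(t);  // external implementation
  } else if constexpr (has_member_area<T>::value) {
    return t.area();                // matching member of the concrete class
  } else {
    return t.width() * t.height();  // default implementation
  }
}

struct Rect {  // only width/height: falls through to the default
  int w, h;
  constexpr int width() const { return w; }
  constexpr int height() const { return h; }
};

struct Disc {  // has its own area(): the member is used
  int r;
  constexpr int area() const { return 3 * r * r; }  // crude integer approximation
};

struct Triangle {  // has a member area(), but the external implementation wins
  int w, h;
  constexpr int area() const { return -1; }
};

template <>
struct shape_impl<Triangle> {
  static constexpr int area(const Triangle& t) { return t.w * t.h / 2; }
};
```

Calling shape_area on a Rect uses the default, on a Disc the member, and on a Triangle the external block, even though Triangle also has a member of the same name.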

This might be similar to template specialization, with a different syntax, but is not.

Concretely, implementers can call the default implementation

struct Triangle {
  ...
};

Triangle : Shape {
  ...
  int area() const { return Shape::area() / 2; } //< call default impl
};

This is because a class, a class concept and an implementation are three different entities. In a specialization we have two. 

Now the question is how are the calls found.

This question has different answers, depending on the context. 

In a standard context:

int main() {
  Triangle t,b;
  t.area();                  //< calls MEMBER function
  t.Triangle::area();        //< same
  t.Shape::area();           //< calls IMPLEMENTATION or MEMBER or DEFAULT 

  Triangle::is_interesection(t, b);  //< calls MEMBER (static)
  Shape::is_interesection(t, b);     //< calls IMPLEMENTATION or MEMBER or DEFAULT (static)

  Triangle::coords_t;         //< access MEMBER type
  Shape::coords_t;            //< access DEFAULT type (No way to know impl for which type)
  Triangle::Shape::coords_t;  //< access IMPLEMENTATION or MEMBER or DEFAULT type
  t.Shape::coords_t;          //< same (we cheat using the object to get the implementing type)
}

In implementation context all calls are separated, there is no fallback mapping, calling the interface returns the default implementation.

Having each implementation be addressable separately allows us to call it directly when implementing an interface. 

struct Triangle {
  int area() const;
};

Triangle : Shape {
  ...
  int area() const { return  Triangle::area() ; } //< call member impl
// or 
// int area() const { return Shape::area() / 2; } //< call default impl  
};

We need a call with fallback in standard context for practical and ergonomic reasons, not technical as we know the type already. 
But we also need a direct call when implementing the concept, so we can reuse code. 

Sidenote. In standard context, if we don't have fallback, each library will have to create wrappers that do the fallback, much like today. The creation of those wrappers would be trivial, unlike today (see example at the end), but is still one extra annoying step. 
Not only that, it will obfuscate the real interface, which is the class concept. Lastly, we need this particular call to be the same as in the dependent type context, described below.

In dependent type context things change drastically, which is the whole point, as our main goal is to have a system, which makes writing algorithm templates more natural. 

template<Shape T>
void func(T t) {
  t.area();                  //< calls IMPLEMENTATION or MEMBER or DEFAULT   
  t.T::area();               //< same
  t.Triangle::area();        //< same
  t.Shape::area();           //< same

  T::is_interesection               //< calls IMPLEMENTATION  or MEMBER or DEFAULT (static)
  Triangle::is_interesection        //< same
  Shape::is_interesection           //< same

  Triangle::coords_t;         //< access IMPLEMENTATION or MEMBER or DEFAULT type
  Shape::coords_t;            //< access DEFAULT type (No way to know impl for which type)
  Triangle::Shape::coords_t;  //< access IMPLEMENTATION or MEMBER or DEFAULT type
  t.Shape::coords_t;          //< same (we cheat using the object to get the implementing type)
}

int main() {
  Triangle t;
  func(t);   

  return 0;      
}

In other words, the 3 implementations are collapsed. There is no access to members of the same name, unless they are an implementation. The class concept completely takes over. All calls will resolve to the implementation with a fallback. 
This is similar to calling a virtual function either via its base pointer or the most derived one - the result is the same. 

This behaviour is highly desirable - if a type is constrained by a class concept it no longer has the final word about names, defined in that concept! 
This guarantees, the author of the algorithm (`func`) has full control of what those names mean - a key requirement for an extension point. The algorithm is defined in terms of the class concept. 

The system presented above makes writing templates as convenient as possible, while still retaining both predictability and convenience (for both author and user). 
The main fundamental limitation is that free functions are NOT supported directly.
That is, if the author wants to create an algorithm which does not use member function calls, he/she must use static members. This is a hard technical requirement in order to avoid all the ADL pitfalls.  
And we do avoid them, because the class concept is an owner of the functions. They are not free. Because functions have an owner, and this owner is not the type, mapping is possible. 
With mapping possible, all is possible - default impl (map to "base" code), member impl (map to the concrete class, after opt in), external impl (map to implementation block code).
All this job is done by the class concept, and in the case of a non-member call, we need to name that class concept. That being said, one can create trivial free wrappers, if wanted (see last example).

Another important benefit of the approach is that it removes the "free vs member" debate completely.
The author is free to choose style (sans "free" functions, must be prefixed by T::) and the user is free to implement outside the class, no matter the style.
This is the best compromise we can make.


--- Examples --- 

Example swap

namespace std {
struct concept object {
  static void swap(object& a, object& b) {
    auto tmp = std::move(a);
    a =  std::move(b);
    b =  std::move(tmp);
  }
  
  // object might have other fundamental implementations
};
} // std

class MyObject {
  ...
  void swap(MyObject& other);
};

MyObject : std::object {
  static void swap(MyObject& a,  MyObject& b) { a.swap(b); }
};

// --- Usage

MyObject a,b;
std::object::swap(a, b); //< will use  MyObject  IMPLEMENTATION (with fallback)

template<class T>  //< *note*, does not NEED to say object T!!! 
void func(T& a, T& b) {
  std::object::swap(a,b); //< will use T IMPLEMENTATION (with fallback)
}
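
For comparison, something close to the swap example can be emulated today with a trait whose primary template carries the default. This is only a sketch under that assumption: object_impl, object_swap, the member_swaps counter and the two helper functions are hypothetical names, added so the dispatch can be observed.

```cpp
#include <utility>

// Stand-in for the class concept: the primary template holds the default.
template <class T>
struct object_impl {
  static constexpr void swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
  }
};

// The one name algorithms call. It always goes through object_impl, so a
// type cannot hijack it with an overload of its own.
template <class T>
constexpr void object_swap(T& a, T& b) {
  object_impl<T>::swap(a, b);
}

struct MyObject {
  int value = 0;
  int member_swaps = 0;  // counts calls, only to make the dispatch observable

  constexpr void swap(MyObject& other) {
    int tmp = value;
    value = other.value;
    other.value = tmp;
    ++member_swaps;
  }
};

// External implementation for MyObject: forward to the member.
template <>
struct object_impl<MyObject> {
  static constexpr void swap(MyObject& a, MyObject& b) { a.swap(b); }
};

// Helpers that exercise both paths at compile time.
constexpr bool default_swap_works() {
  int a = 1, b = 2;
  object_swap(a, b);  // no implementation for int: default is used
  return a == 2 && b == 1;
}

constexpr bool external_swap_used() {
  MyObject a{1}, b{2};
  object_swap(a, b);  // implementation for MyObject: member swap is used
  return a.value == 2 && b.value == 1 && a.member_swaps == 1;
}
```

The difference from the proposal is that the caller must use the free wrapper and the implementer must write a specialization; the class concept would make both of these steps implicit.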

Example hash

namespace std {
struct concept object {
  size_t hash() const; //< no default implementation
  
  // object might have other fundamental implementations
};

// provide impl for integers

template<std::integral T>
T : object {
  size_t hash() const {
    ...
  }
  // other object functions if needed, or leave to default
};

} // std

class MyObject {
    int row;
    int col;
};

MyObject : std::object {
  size_t hash() const {
    std::size_t h1 = row.std::object::hash();
    std::size_t h2 = col.std::object::hash();

    return h1 ^ (h2 << 1);
  }
};


If we want, we can provide a convenient wrapper

namespace std {
...
inline const auto hash = [](const object auto& o) { return o.hash(); };
}

// User code

int main() {

  int i;

  std::hash(i); //< will call the std::integral IMPLEMENTATION (object::hash has no default)

  MyObject o;

  std::hash(o); //< will call MyObject IMPLEMENTATION

  return 0;
}
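
The hash example can be approximated in C++17 along the same lines. Again, this is a sketch and not the proposed syntax: hash_impl, hash_of and Cell are hypothetical names, and the identity hash for integers is just a placeholder.

```cpp
#include <cstddef>
#include <type_traits>

// No default implementation: the primary template is declared, not defined.
template <class T, class = void>
struct hash_impl;

// One implementation covering all integral types.
template <class T>
struct hash_impl<T, std::enable_if_t<std::is_integral_v<T>>> {
  static constexpr std::size_t hash(T v) {
    return static_cast<std::size_t>(v);  // placeholder: identity
  }
};

// Convenience wrapper, the counterpart of the std::hash lambda above.
template <class T>
constexpr std::size_t hash_of(const T& v) {
  return hash_impl<T>::hash(v);
}

struct Cell {
  int row;
  int col;
};

// External implementation for Cell, built from the integral one.
template <>
struct hash_impl<Cell> {
  static constexpr std::size_t hash(const Cell& c) {
    std::size_t h1 = hash_of(c.row);
    std::size_t h2 = hash_of(c.col);
    return h1 ^ (h2 << 1);
  }
};
```

With the placeholder hash, hash_of(Cell{1, 2}) combines 1 and 2 into 1 ^ (2 << 1), which is 5. A type with no implementation fails to compile, which matches "no default implementation" in the example.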

--- Future direction ---

Concept classes, much like other classes, could have virtual functions. This allows us to have classes usable with both static and dynamic polymorphism, with the same interface. 

struct concept Shape {
  ..
  virtual coords_t width();
  virtual coords_t height();
  ...
};

struct Triangle : Shape {
  ...
  coords_t width() override;
  coords_t height() override;
};

void use_statically(Shape auto& sh);

void use_dynamically(Shape& sh);

Of course, all the limitations of abstract classes apply (if used via the type-erased interface) - no associated types, static functions and constants cannot be overridden, no impl outside the class, etc. 
Still, if one needs type erasure at some point, this is a better option than inventing another abstract base class that does the same thing. Also, this way classes which initially used dynamic dispatch only can be migrated to a static one.
Implementation outside class is not proposed - this is a normal abstract base class subclassing + the possibility to use it as a concept constraint.  


--
Std-Proposals mailing list
Std-Proposals_at_[hidden]
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2023-06-19 04:57:40