ISOCPP std-proposals List: [std-proposals] Pre-Draft Proposal for elaborate extension points

From: Михаил Найденов <mihailnajdenov_at_[hidden]>
Date: Sun, 11 Jun 2023 14:51:55 +0300

Hello, here is an outline of a system that serves to
- be a reliable and easy-to-use "extension point" mechanism
- bridge the gap between virtual functions and constrained templates

(Spoiler, it is similar to Circle's impl, but there are differences, some
of them significant)

First let's step back and take a look at one of the most widespread ways to
write abstract code - the virtual class.

Given
void func(Car s);
to make the above work on any Car-like, one must copy the interface of Car
into a new class, with a new name, and make some or all functions virtual.
Then the original Car, as well as all users, must subclass the new class,
let's call it Vehicle, and also change the param of func to take Vehicle by
reference.
Last, but not least, we are forced into member-style syntax.
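
For reference, the refactoring described above can be sketched in today's
C++ roughly as follows. The member name honk() and the second class Truck
are made up for illustration; only Car, Vehicle and func come from the text.

```cpp
#include <string>

// Hypothetical interface extracted from Car; honk() is an illustrative member.
struct Vehicle {
    virtual ~Vehicle() = default;
    virtual std::string honk() const = 0;
};

// The original class now has to subclass the new interface.
struct Car : Vehicle {
    std::string honk() const override { return "beep"; }
};

// Any other Car-like type must subclass it as well.
struct Truck : Vehicle {
    std::string honk() const override { return "HOOONK"; }
};

// func takes the interface by reference instead of Car by value,
// and can only use member-style calls.
std::string func(const Vehicle& v) { return v.honk(); }
```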

This is both easy and hard, and both powerful and limiting, depending on
the viewpoint.
One thing is certain: for better or worse, this model is a *monumental* success
overall (not just in C++). This tells us that it is not "hard"
relative to its benefits, and that it is powerful enough.

Interestingly, stepping into templates, we gain "easiness" (no more
interfaces!) and also power: we can abstract associated types, we are not
limited to one call-expression style, etc.
Or so it seemed. In practice, after decades of experience, it turned out that
writing a correct algorithm template, where the author gives the user as much
flexibility as possible while still having guarantees of a reliable
implementation, is way, waaaaaay harder than using abstract classes. Not only
that, the original algorithm gets buried inside all the extra machinery
needed for correct templatization.

Ideally, a template-based generic function should need effort similar to
that of abstract classes.
*Only then can it be a real alternative.* If not, correct templates
will stay with the experts, and the rest of the world will continue to use
naive templates and abstract classes, *as is the case today*.

How can we improve that? The first thing is to build upon what is known and
understandable, then provide a simple step to gain more power on top of
that.
*This is the model that made abstract classes so successful!*
From the well-known class (the understandable), make a simple step (add the
keyword virtual) and gain considerable power.

*This is our goal: the same effort and the same simplicity as with
virtual classes.*

Back to our "customization point" case, given what we know and understand -
the class:

struct Printer {
  using Canvas = Screen;
  static void connect(Printer&, Device&);
  void print(string s) const;
};

Add the keyword concept

struct concept Printer {
  ...
};

*Gain power right away*:

In a dependent context,
- you can have different Printers, each with a different Canvas
(associated types),
- have different static functions and of course different print()
implementations.
- classes do not *need* to subclass Printer (but they can)
*Most of these are not possible with abstract base classes, yet the effort
is similar, so there is an incentive!*

On the algorithm definition side the effort is similar to abstract classes
as well:

void func(Printer auto& p) { //< printer is not a concrete class any more
  // use Printer
}

As you can see, Printer *is (also) a concept.*
It will act similarly (but not identically!) to a compiler-generated
expression-based concept.

There are two important differences, which are the reason why class
concepts do not replace regular concepts, but complement them.

The first is not required, but highly desirable.
Upon instantiation, the compiler *should* check the class concept in order to
know *what member is what*, ending the need for typename and template inside
function bodies! Again, not required, but an embarrassment without it.
Full definition checking *is not proposed.*

The second *is required* and is one of the things that make class
concepts suitable as extension points.
*All member names (static and instance) belong to the class concept.* The
type can NOT provide an overload for a given name. (The concrete class is
not the master here; the class concept is.)

void func(Printer auto& p) {
  p.print("hi"); //< OK
  p.print(12); //< ERROR, print is only what Printer says it is!!!
  p.something_else(); //< OK, we don't care
}

The reason for the above change is that class concepts are *nominal
requirements*, *a must* for any extension point implementation.
They *complement* regular concepts (*structural requirements*). In fact,
the following is perfectly valid:

template<class T>
concept MovablePrinter = Printer && std::movable<T>;

As said, a class concept is a kind of concept (if that is not already obvious
from the name), with the template argument being implicit.

Let's look at the other half of a class concept, the "class" part.
It provides something regular concepts do not have - a place to have an
implementation!

struct concept Printer {
  using Canvas = Screen; //< assoc type (default)
  ...
  void print(string s) const { std::cout << s << "\n"; } //< print implementation (default)
};

This way, a class implementing the interface does not need to provide all
the implementations (as is the case with template specialization). It
can, however, still call the default implementation.

A class concept is a limited class.
In particular, its `this` pointer is of unknown type, effectively a
template param.
It also cannot have instance data members because, well, there are no
instances of it: it cannot be instantiated directly. The only way to
instantiate it is as a base implementation for another, concrete class,
similar to a purely abstract base class.
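
The closest present-day approximation of "a class whose `this` is
effectively a template param" is probably CRTP. The sketch below is not the
proposed feature, only an analogy; ShapeBase and Rect are hypothetical names.

```cpp
// CRTP base: the default implementation is written against the
// implementing type, which is only known as a template parameter.
template <class Derived>
struct ShapeBase {
    int area() const {
        const auto& self = static_cast<const Derived&>(*this);
        return self.width() * self.height();
    }
};

// Concrete class: provides width()/height() and inherits the default area().
struct Rect : ShapeBase<Rect> {
    Rect(int w, int h) : w_(w), h_(h) {}
    int width() const { return w_; }
    int height() const { return h_; }

private:
    int w_;
    int h_;
};
```

Unlike the class concept described here, the CRTP base has to be named in
the class definition, so it cannot be attached to an existing type from the
outside.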

This can be normal subclassing:

struct concept Shape {
  using coords_t = int;
  static bool is_interesection(const Shape&, const Shape&) { //< Shape here is a stand-in for the implementing type.
    /*check bounding box intersection*/
  }
  coords_t width() const;
  coords_t height() const;
  int area() const { return width() * height(); }
  ...
};

// Subclass

class RectF : Shape {
  ...
  using coords_t = float;
  coords_t width() const { return this->getWidth(); }
  coords_t height() const { return this->getHeight(); }
  ...
};

Even at this limited usage, we gain additional power, compared to standard
subclassing:

void func(Shape auto sh) {
  // use Shape

  // in particular, Shape::is_interesection will be called with the
  // subclassed type, in contrast to normal subclassing
}

RectF r;
func(r); //< is_interesection is called with RectF

*However, the implementation can also be external!*

Existing class:

struct RectF {
  ...
};

Add Shape conformance:

RectF : Shape {
  ...
  coords_t width() const { return this->getWidth(); }
  coords_t height() const { return this->getHeight(); }
};

The use is unchanged:

void func(Shape auto sh) {
  // use Shape: Assoc type, static and instance methods
  // They will resolve to either
  // - the class concept default implementation
  // - or the concrete class external implementation
  // - or the concrete class members, in case of an empty external
  //   implementation and matching member names.
}

This might look similar to template specialization with a different syntax,
but it is not.

Concretely, implementers *can call the default implementation*:

struct Triangle {
  ...
};

Triangle : Shape {
  ...
  int area() const { return Shape::area() / 2; } //< call default impl
};

This is because a class, a class concept and an implementation are
*three different entities*. In a specialization we have two.
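
To see why the third entity matters, here is one way to imitate the split
with today's templates. This is only a sketch under assumptions: the names
shape_default, shape_impl, area_of and Box are hypothetical, and Triangle is
given simple width()/height() members for the demo.

```cpp
// Entity 1: the concrete classes, unaware of any interface.
struct Triangle {
    int w;
    int h;
    int width() const { return w; }
    int height() const { return h; }
};

struct Box {
    int w;
    int h;
    int width() const { return w; }
    int height() const { return h; }
};

// Entity 2: the defaults that the "class concept" would carry.
template <class T>
struct shape_default {
    static int area(const T& s) { return s.width() * s.height(); }
};

// Entity 3: the per-type implementation; unless specialized it reuses the defaults.
template <class T>
struct shape_impl : shape_default<T> {};

// External implementation for Triangle that still calls the default.
template <>
struct shape_impl<Triangle> : shape_default<Triangle> {
    static int area(const Triangle& t) {
        return shape_default<Triangle>::area(t) / 2;
    }
};

// The algorithm only ever talks to the implementation layer.
template <class T>
int area_of(const T& s) { return shape_impl<T>::area(s); }
```

With a plain specialization of a single template, the default and the
implementation are the same entity, so the specialization has nothing to
call back into. Keeping the defaults in their own template is what makes
the `Shape::area() / 2` style of reuse possible in this emulation.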

Now the question is how the calls are found.

This question has different answers, depending on the context.

In a *standard context*:

int main() {
  Triangle t,b;
  t.area(); //< calls MEMBER function
  t.Triangle::area(); //< same
  t.Shape::area(); //< calls IMPLEMENTATION or MEMBER or DEFAULT

  Triangle::is_interesection(t, b); //< calls MEMBER (static)
  Shape::is_interesection(t, b); //< calls IMPLEMENTATION or MEMBER or DEFAULT (static)

  Triangle::coords_t; //< access MEMBER type
  Shape::coords_t; //< access DEFAULT type (no way to know impl for which type)
  Triangle::Shape::coords_t; //< access IMPLEMENTATION or MEMBER or DEFAULT type
  t.Shape::coords_t; //< same (we cheat using the object to get the implementing type)
}

In *implementation context* all calls are separated: there is no fallback
mapping, and calling the interface returns the default implementation.

Having each implementation be addressable separately allows us to call it
directly when implementing an interface.

struct Triangle {
  int area() const;
};

Triangle : Shape {
  ...
  int area() const { return Triangle::area(); } //< call member impl
// or
// int area() const { return Shape::area() / 2; } //< call default impl
};

We need a call with fallback in standard context for practical and
ergonomic reasons, not technical ones, as we already know the type.
But we also need a direct call when implementing the concept, so we can
reuse code.

Sidenote: in standard context, if we don't have fallback, each library will
have to create wrappers that do the fallback, much like today. The creation
of those wrappers would be *trivial*, unlike today (see the example at the
end), but it is still one extra annoying step.
Not only that, it would *obfuscate* the real interface, which is the class
concept. Lastly, we need this particular call to behave the same as in the
*dependent type context*, described below.

In *dependent type context* things change drastically, which is the whole
point, as our main goal is to have a system, which makes writing
algorithm templates more natural.

template<Shape T>
void func(T t) {
  t.area(); //< calls IMPLEMENTATION or MEMBER or DEFAULT

  t.T::area(); //< same
  t.Triangle::area(); //< same
  t.Shape::area(); //< same

  T::is_interesection //< calls IMPLEMENTATION or MEMBER or DEFAULT (static)
  Triangle::is_interesection //< same
  Shape::is_interesection //< same

  Triangle::coords_t; //< access IMPLEMENTATION or MEMBER or DEFAULT type
  Shape::coords_t; //< access DEFAULT type (no way to know impl for which type)
  Triangle::Shape::coords_t; //< access IMPLEMENTATION or MEMBER or DEFAULT type
  t.Shape::coords_t; //< same (we cheat using the object to get the implementing type)
}

int main() {
  Triangle t;
  func(t);

  return 0;
}

In other words, the 3 implementations are *collapsed*. There is *no access*
to members of the same name *unless* they are an implementation. The class
concept completely takes over. All calls will resolve to the implementation
with a fallback.
This is similar to calling a virtual function either via its base pointer
or the most derived one - the result is the same.

This behaviour is highly desirable: if a type is constrained by a class
concept, *it no longer has the final word about names defined in that
concept!*
This *guarantees* that the author of the algorithm (`func`) has *full control
of what those names mean*, a key requirement for an *extension point*. The
algorithm is defined *in terms of* the class concept.

The system presented above makes writing templates as convenient as
possible, while still retaining predictability (for both author and user).
The one fundamental limitation is that free functions are NOT
supported *directly*.
That is, if the author wants to create an algorithm which does not use
member function calls, he/she must use static members. This is a hard
technical requirement in order to avoid all the ADL pitfalls.
And we do avoid them, because the class concept is the *owner* of the
functions. They are not free. Because functions have an owner, and this
owner is not the type, mapping is possible.
With mapping possible, everything is possible: default impl (map to "base"
code), member impl (map to the concrete class, after opt-in), external
impl (map to implementation block code).
All this work is done by the class concept, and in the case of a non-member
call, we need to *name* that class concept. That being said, one can create
trivial free wrappers if wanted (see the last example).

Another important benefit of the approach is that it *removes the "free vs
member" debate completely*.
The author is free to choose the style (except that "free" functions must be
prefixed by T::) and the user is free to implement outside the class, no
matter the style.
This is the best compromise we can make.

*--- Examples ---*

*Example swap*

namespace std {
struct concept object {
  static void swap(object& a, object& b) {
    auto tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
  }

  // object might have other fundamental implementations
};
} // std

class MyObject {
  ...
  void swap(MyObject& other);
};

MyObject : std::object {
  static void swap(MyObject& a, MyObject& b) { a.swap(b); }
};

// --- Usage

MyObject a,b;
std::object::swap(a, b); //< will use MyObject IMPLEMENTATION (with fallback)

template<class T> //< *note*, does not NEED to say object T!!!
void func(T& a, T& b) {
  std::object::swap(a,b); //< will use T IMPLEMENTATION (with fallback)
}
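
For comparison, this is roughly what the same customization looks like in
today's C++ with the ADL "two-step". The namespace lib, the swaps counter
and the helper exchange_values are made up for the sketch.

```cpp
#include <utility>

namespace lib {
struct MyObject {
    int value = 0;
    int swaps = 0; // counts calls to the custom swap, for illustration only

    void swap(MyObject& other) {
        std::swap(value, other.value);
        ++swaps;
        ++other.swaps;
    }
};

// Free function found via ADL; this is today's customization hook.
void swap(MyObject& a, MyObject& b) { a.swap(b); }
} // namespace lib

template <class T>
void exchange_values(T& a, T& b) {
    using std::swap; // bring the generic fallback into scope
    swap(a, b);      // unqualified call: ADL picks lib::swap when it exists
}
```

The algorithm author has to remember the `using std::swap;` line and the
unqualified call; writing `std::swap(a, b)` instead would silently skip the
user's customization. In the proposed model the qualified call
`std::object::swap(a, b)` is the one that does the mapping.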

*Example hash*

namespace std {
struct concept object {
  size_t hash() const; //< no default implementation

  // object might have other fundamental implementations
};

// provide impl for integers

template<std::integral T>
T : object {
  size_t hash() const {
    ...
  }
  // other object functions if needed, or leave to default
};

} // std

class MyObject {
    int row;
    int col;
};

MyObject : std::object {
  size_t hash() const {
    std::size_t h1 = row.std::object::hash();
    std::size_t h2 = col.std::object::hash();

    return h1 ^ (h2 << 1);
  }
};

If we want, we can provide a *convenient wrapper*

namespace std {
...
inline const auto hash = [](const object auto& o) { return o.hash(); };
}

// User code

int main() {

  int i = 0;

  std::hash(i); //< will call the std::integral IMPLEMENTATION (hash has no default)

  MyObject o;

  std::hash(o); //< will call MyObject IMPLEMENTATION

  return 0;
}
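
For comparison, the present-day way to make a user type hashable is to
specialize std::hash, as sketched below. Here row and col are public so that
the specialization can read them; that is an assumption of the sketch, not
something stated in the example above.

```cpp
#include <cstddef>
#include <functional>

struct MyObject {
    int row;
    int col;
};

namespace std {
// Today's extension point: a full specialization of std::hash for the user type.
template <>
struct hash<MyObject> {
    size_t operator()(const MyObject& o) const noexcept {
        size_t h1 = hash<int>{}(o.row);
        size_t h2 = hash<int>{}(o.col);
        return h1 ^ (h2 << 1); // same combination as in the example above
    }
};
} // namespace std
```

The specialization has to be written inside (or qualified into) namespace
std, and a call site has to spell `std::hash<MyObject>{}(o)` rather than
`o.hash()` or `std::hash(o)`.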

*--- Future direction ---*

Class concepts, much like other classes, *could have virtual functions*. This
allows us to have classes usable with both static and dynamic polymorphism,
with the same interface.

struct concept Shape {
  ...
  virtual coords_t width();
  virtual coords_t height();
  ...
};

struct Triangle : Shape {
  ...
  coords_t width() override;
  coords_t height() override;
};

void use_statically(Shape auto& sh);

void use_dynamically(Shape& sh);

Of course, all the limitations of abstract classes apply (if used via the
type-erased interface): no associated types, static functions and
constants cannot be overridden, no impl outside the class, etc.
Still, if one needs type erasure at some point, this is a better option
than inventing another abstract base class that does the same thing.
Also, this way classes which initially used dynamic dispatch only can be
migrated to static dispatch.
Implementation outside the class is not proposed here: this is normal
abstract base class subclassing plus the possibility to use it as a concept
constraint.

Received on 2023-06-11 11:52:08