[linear.algebra]R1-Draft
A proposal to add linear algebra support to the C++ standard library

Draft Proposal,

This version:
https://github.com/BobSteagall/wg21/blob/master/linear_algebra/papers/P1385R1-draft.bs
Latest published version:
http://wg21.link/P1385
Authors:
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++

Abstract

This document proposes an interface specification for a set of fundamental linear algebra facilities in the standard C++ library.

1. Introduction

Linear algebra is a mathematical discipline of ever-increasing importance, with direct application to a wide variety of problem domains, such as signal processing, computer graphics, medical imaging, scientific simulations, machine learning, analytics, financial modeling, and high-performance computing. And yet, despite the relevance of linear algebra to so many aspects of modern computing, the C++ standard library does not include a set of linear algebra facilities. This paper proposes to remedy this deficit for C++23.

This paper should be read after P1166, in which we describe a high-level set of expectations for what a linear algebra library should contain.

2. Goals

We expect that typical users of a standard linear algebra library are likely to value two features above all else: ease-of-use (including expressiveness), and high performance. This set of users will expect the ability to compose arithmetical expressions of linear algebra objects similarly to what one might find in a textbook; indeed, this has been deemed a "must-have" feature by several participants in recent SG14 Linear Algebra SIG conference calls. And for a given arithmetical expression, they will expect run-time computational performance that is close to what they could obtain with an equivalent sequence of function calls to a more "traditional" linear algebra library, such as LAPCK, Blaze, Eigen, etc.

There also exists a set of linear algebra “super-users” who will value most highly a third feature – the ability to customize underlying infrastructure in order to maximize performance for specific problems and computing platforms. These users seek the highest possible run-time performance, and to achieve it, require the ability to customize any and every portion of the library’s computational infrastructure.

With these high-level user requirements in mind, we propose an interface specification intended to achieve the following goals:

  1. To provide a set of vocabulary types for representing the mathematical objects and operations that are relevant to linear algebra;

  2. To provide a public interface for linear algebra operations that is intuitive, teachable, and mimics the expressiveness of mathematical notation to the greatest reasonable extent;

  3. To exhibit out-of-the-box performance rivalling that of an equivalent sequence of function calls to a more traditional linear algebra library, such as LAPACK, Blaze, Eigen, etc.;

  4. To provide a set of building blocks that manage the source, ownership, lifetime, layout, and access to the memory required to represent the linear algebra vocabulary types, with the requirement that these building blocks are also suitable for eventually representing other interesting mathematical entities, such as quaternions, octonions, and tensors; and,

  5. To provide straightforward facilities and techniques for customization that enable users to optimize performance for their specific problem domain on their specific hardware.

3. Definitions

When discussing linear algebra and related topics for a proposal such as this, it is important to note that there are several overloaded terms (such as matrix, vector, dimension, and rank) which must be defined and disambiguated if such discussions are to be productive. These terms have specific meanings in mathematics, as well as different, but confusingly similar, meanings to C++ programmers.

In the following sections we provide definitions for relevant mathematical concepts, C++ type design concepts, and describe how this proposal employs those overloaded terms in various contexts.

3.1. Mathematical Terms

In order to facilitate subsequent discussion, we first provide the following informal set of definitions for important mathematical concepts:

  1. A vector space is a collection of vectors, where vectors are objects that may be added together and multiplied by scalars. Euclidean vectors are an example of a vector space, typically used to represent displacements, as well as physical quantities such as force or momentum. Linear algebra is concerned primarily with the study of vector spaces.

  2. The dimension of a vector space is defined as the minimum number of coordinates required to specify any point within the space.

  3. A matrix is a rectangular array of numbers, symbols, or expressions, arranged in rows and columns. A matrix having m rows and n columns is said to have size m x n. Although matrices can be used solve systems of simultaneous linear equations, they are most commonly used to represent linear transformations, solve linear least squares problems, and to explore and/or manipulate the properties of vector spaces.

  4. The rank of a matrix is the dimension of the vector space spanned by its columns, which is equal to the dimension of the vector space spanned by its rows. The rank is also equal to the maximum number of linearly-independent columns and rows.

  5. An element of a matrix is an individual member (number, symbol, expression) of the rectangular array comprising the matrix, lying at the intersection of a single row and a single column. In traditional mathematical notation, row and column indexing is 1-based, where rows are indexed from 1 to m and columns are indexed from 1 to n. Given some matrix A, element a11 refers to the element in the upper left-hand corner of the array and element amn refers to the element in the lower right-hand corner.

  6. A row vector is a matrix containing a single row; in other words, a matrix of size 1 x n. In many applications of linear algebra, row vectors represent spatial vectors.

  7. A column vector is a matrix containing a single column; in other words, a matrix of size m x 1. In many applications of linear algebra, column vectors represent spatial vectors.

  8. Element transforms are non-arithmetical operations that modify the relative positions of elements in a matrix, such as transpose, column exchange, and row exchange.

  9. Element arithmetic refers to arithmetical operations that read or modify the values of individual elements independently of other elements, such assigning a value to a specific element or multiplying a row by some value.

  10. Matrix arithmetic refers to the assignment, addition, subtraction, negation, multiplication, and determinant operations defined for matrices, row vectors, and column vectors as wholes.

  11. A rectangular matrix is a matrix requiring a full m x n representation; that is, a matrix not possessing a special form, such as identity, triangular, band, etc.

  12. The identity matrix is a square matrix where all elements on the diagonal are equal to one and all off-diagonal elements are equal to zero.

  13. A triangular matrix is a matrix where all elements above or below the diagonal are zero; those with non-zero elements above the diagonal are called upper triangular, while those with non-zero elements below the diagonal are called lower triangular.

  14. A band matrix is a sparse matrix whose non-zero entries are confined to a diagonal band, lying on the main diagonal and zero or more diagonals on either side.

  15. Decompositions are complex sequences of arithmetic operations, element arithmetic, and element transforms performed upon a matrix that expose important mathematical properties of that matrix. Several types of decomposition are often performed in solving least-squares problems.

  16. Eigen-decompositions are decompositions performed upon a symmetric matrix in order to compute the eigenvalues and eigenvectors of that matrix; this is often performed when solving problems involving linear dynamic systems.

3.2. Terms Pertaining to C++ Types

The following are terms used in this proposal that describe various aspects of how the mathematical concepts described above in Section 3.1 might be implemented:

  1. An array is a data structure representing an indexable collection of objects (elements) such that each element is identified by at least one index. An array is said to be one-dimensional array if its elements are accessible with a single index; a multi-dimensional array is an array for which more than one index is required to access its elements.

  2. The dimension of an array refers to the number of indices required to access an element of that array. The rank of an array is a synonym for its dimension.

  3. This proposal uses the term MathObj to refer generically to one of the C++ types described herein representing matrices, row vectors, and column vectors. These are the public-facing facilities developers will use in their code.

  4. An engine is an implementation type that manages the storage-related aspects of, and access to, the elements of a MathObj. In this proposal, an engine object is a private member of a MathObj. Other than as a template parameter, engines are not part of a MathObj’s public interface.

  5. The adjective dense refers to a MathObj representation where storage is allocated for every element.

  6. The adjective sparse refers to a MathObj representation where storage is allocated only for non-zero elements;

  7. Storage is used by this proposal as a synonym for memory.

  8. Traits refers to a stateless class template that provides some set of services, normalizing those services over its set of template parameters.

  9. Row size and column size refer to the number of rows and columns, respectively, that a MathObj represents, which must be less than or equal to its row and column capacities, defined below.

  10. Row capacity and column capacity refer to the maximum number of rows and columns, respectively, that a MathObj can possibly represent.

  11. Fixed-size (FS) refers to an engine type whose row and column sizes are fixed at instantiation time and constant thereafter.

  12. Fixed-capacity (FC) refers to an engine type whose row and column capacities are fixed at instantiation time and constant thereafter.

  13. Dynamically re-sizable (DR) refers to an engine type whose row and column sizes and capacities may be changed at run time.

3.3. Overloaded Terms

This section describes how we use certain overloaded terms in this proposal and in future works.

3.3.1. Matrix

The term matrix is frequently used by C++ programmers to mean a general-purpose array of arbitrary size. For example, one of the authors worked at a company where it was common practice to refer to 4-dimensional arrays as “4-dimensional matrices.”

In this proposal, we use the word array only to mean a data structure whose elements are accessible using one or more indices, and which has no invariants pertaining to higher-level or mathematical meaning.

We use matrix to mean the mathematical object as defined above in Section 3.1, and matrix (in monospaced font) to mean the C++ class template that implements the mathematical object. We sometimes use MathObj (in monospaced font) in some of the component interface code and text below to generically refer to a matrix, row_vector, or column_vector object.

3.3.2. Vector

Likewise, many C++ programmers incorrectly use the term vector as a synonym for “dynamically re-sizable array.” This bad habit is exacerbated by the unforgivably awful naming of std::vector.

This proposal uses the term vector to mean an element of a vector space, per Section 3.1. Further, row vector and column vector have the meanings set out in 3.1, while row_vector and column_vector (in monospaced font) are the C++ class templates implementing those mathematical objects, respectively. We sometimes use MathObj (in monospaced font) in some of the component code interface code and test below to generically refer to a row_vector, column_vector, or matrix object.

3.3.3. Dimension

In linear algebra, a vector space V is said to be of dimension n, or n-dimensional, if there exist n linearly independent vectors which span V. This is another way of saying that n is the minimum number of coordinates required to specify any point in V. However, in common programming parlance, dimension refers to the number of indices used to access an element in an array.

We use the term dimension both ways in this proposal but try to do so consistently and in a way that is clear from the context. For example, a matrix describing a rotation in a 3D virtual reality application is an example of a 2-dimensional data structure containing 3-dimensional row and column vectors. A vector describing an electric field is an example of a 1-dimensional data structure implemented as a 3-dimensional row vector.

3.3.4. Rank

The rank of a matrix is the dimension of the vector space spanned by its columns (or rows), which corresponds to the maximal number of linearly independent columns (or rows) of that matrix. Rank also has yet another meaning in tensor analysis, where it is commonly used as a synonym for a tensor’s order.

However, rank also has a meaning in computer science where it is used as a synonym for dimension. In the C++ standard at [meta.unary.prop.query], rank is described as the number of dimensions of T if T names an array, otherwise it is zero.

We avoid using the term rank in this proposal in the context of linear algebra, except as a quantity that might result from performing certain decompositions.

4. Scope

We contend that the best approach for standardizing a set of linear algebra components for C++23 will be one that is layered, iterative, and incremental. This paper is quite deliberately a “basic linear algebra-only” proposal; it describes what we believe is the minimum set of components necessary to provide a certain basic level of functionality. Higher-level functionality can be specified in terms of the interfaces described here, and we encourage succession papers to explore this possibility.

4.1. Functionality

The foundational layer, as described here, should include the minimal set of types and functions required to perform matrix functions in finite dimensional spaces. This includes:

4.2. Considered but Excluded

Tensors

There has been a great deal of interest expressed in specifying an interface for general-purpose tensor processing in which linear algebra facilities fall out as a special case. We exclude this idea from this proposal for two reasons. First, given the practical realities of standardization work, the enormous scope of such an effort would very likely delay introduction of linear algebra facilities until C++26 or later.

Second, and more importantly, implementing matrices as derived types or specializations of a general-purpose tensor type is bad type design. Consider the following: a tensor is (informally) an array of mathematical objects (numbers or functions) such that its elements transform according to certain rules under a coordinate system change. In a p-dimensional space, a tensor of rank n will have pn elements. In particular, a rank-2 tensor in a p-dimensional space may be represented by a p x p matrix having certain properties related to coordinate transformation not possessed by all p x p matrices.

These defining characteristics of a tensor lead us to the crux of the issue: every rank-2 tensor can be represented by a square matrix, but not every square matrix represents a tensor. As one quickly realizes, only a small fraction of all possible matrices are representations of rank-2 tensors.

All of this is a long way of saying that the class invariants governing a matrix type are quite different from those governing a tensor type, and as such, the public interfaces of such types will also differ substantially.

From this we conclude that matrices are not Liskov-substitutable for rank-2 tensors, and therefore as matter of good type design, matrices and tensors should be implemented as distinct types, perhaps with appropriate inter-conversion operations.

This situation is analogous to the age-old object-oriented design question: when designing a group of classes that represent geometric shapes, is a square a kind of rectangle? In other words, should class square be publicly derived from class rectangle? Mathematically, yes, a square is a rectangle. But from the perspective of good interface design, class square is not substitutable for class rectangle and is usually best implemented as a distinct type having no IS-A relationship with rectangle.

Quaternions and Octonions

There has also been interest expressed in including other useful mathematical objects, such as quaternions and octonions, as part of a standard linear algebra library. Although element storage for these types might be implemented using the engines described in this proposal, quaternions and octonions represent mathematical concepts that are fundamentally different from those of matrices and vectors.

As with tensors, the class invariants and public interfaces for quaternions and octonions would be substantially different from that of the linear algebra components. Liskov substitutability would not be possible, and therefore quaternions and octonions should be implemented as a set of types distinct from the linear algebra types.

5. Design Aspects

The following describe several important aspects of the problem domain affecting the design of the proposed interface. Importantly, these aspects are orthogonal, and are addressable through judicious combinations of template parameters and implementation type design.

5.1. Memory Source

Perhaps the first question to be answered is that of the source of memory in which elements will reside. One can easily imagine multiple sources of memory:

5.2. Addressing Model

It is also possible that the memory used by a MathObj might be addressed using what the standard calls a pointer-like type, also known as a fancy pointer.

For example, consider an element buffer existing in a shared memory segment managed by a custom allocator. In this case, the allocator might employ a fancy pointer type that performs location-independent addressing based on a segment index and an offset into that segment.

One can also imagine a fancy pointer that is a handle to a memory resource existing somewhere on a network, and addressing operations require first mapping that resource into the local address space, perhaps by copying over the network or by some magic sequence of RPC invocations.

5.3. Memory Ownership

The next important questions pertain to memory ownership. Should the memory in which elements reside be deallocated, and if so, what object is responsible for performing the deallocation?

A MathObj might own the memory in which it stores its elements, or it might employ some non-owning view type, like mdspan, to manipulate elements owned by some other object.

5.4. Capacity and Resizability

As with std::string and std::vector, it is occasionally useful for a MathObj to have excess storage capacity in order to reduce the number of re-allocations required by future resizing operations. Some linear algebra libraries, like LAPACK, account for the fact that a MathObj’s capacity may be different than its size. This capability was of critical importance to the success of one author’s prior work in functional MRI image analysis.

In other problem domains, like computer graphics, MathObjs are small and always of the same size. In this case, the size and capacity are equal, and there is no need for a MathObj to maintain or manage excess capacity.

5.5. Element Layout

There are many ways to arrange the elements of a matrix in memory, the most common in C++ being row-major dense rectangular. In Fortran-based libraries, the two-dimensional arrays used to represent matrices are usually column-major. There are also special arrangements of elements for upper/lower triangular and banded diagonal matrices that are both row-major and column-major. These arrangements of elements have been well-known for many years, and libraries like LAPACK in the hands of a knowledgeable user can use them to implement code that is optimal in both time and space.

5.6. Element Access and Indexing

In keeping with the goal of supporting a natural syntax, and in analogy with the indexing operations provided by the random-access standard library containers, it seems reasonable to provide both const and non-const indexing for reading and writing individual elements.

However, support for element indexing raises an important question: should MathObjs employ 1-based indexing or 0-based indexing? 1-based indexing is the convention used in mathematical notation (and Fortran), whereas 0-based indexing is “the C++ way.”

5.7. Element Type

C++ supports a relatively narrow range of arithmetic types, lacking direct support for arbitrary precision numbers and fixed-point numbers, among others. Libraries exist to implement these types, and they should not be precluded from use in a standard linear algebra library. It is possible that individual elements of a MathObj may allocate memory, and therefore an implementation cannot assume that element types have trivial constructors or destructors.

5.8. Mixed-Element-Type Expressions

In general, when multiple built-in arithmetic types are present in an arithmetical expression, the resulting type will have a precision greater than or equal to that of the type with greatest precision in the expression. In other words, to the greatest reasonable extent, information is preserved.

A similar principal should apply to expressions involving MathObjs where more than one element type is present. Arithmetic operations involving MathObjs should, to the greatest reasonable extent, preserve element-wise information.

For example, just as the result of multiplying a float by a double is a double, the result multiplying a matrix-of-float by a matrix-of-double should be a matrix-of-double. We call the process of determining the resulting element type element promotion.

5.9. Mixed-Engine Expressions

In analogy with element type, MathObj expressions may include mixed storage management strategies as implemented by their corresponding engine types. For example, consider the case of a fixed-size matrix multiplied by a dynamically-resizable matrix. What is the engine type of the resulting matrix?

Expression involving mixed engine types should not limit the availability of basic arithmetic operations. This means that there should be a mechanism for determining the engine type of the resulting from such expressions. We call the process of determining the resulting engine type engine promotion.

We contend that in most cases, the resulting engine type should be at least as "general" as the most "general" of the two engine types. For example, one could make the argument that a dynamically-resizable engine is more general that a fixed-size engine, and therefore an the resulting engine type in an expression involving both these engine types should be a dynamically-resizable engine.

5.10. Concurrency and Parallelism

In pursuit of optimal performance, developers may want to use multiple cores to carry out multiplication on very large pairs of matrices, particularly in situations where the operation is used to produce a third matrix rather than modify one of the operands. The matrix multiplication operation is highly amenable to this approach, since a thread may be used for each row of the source matrix.

Developers may also wish to make use of SIMD intrinsics to enable parallel evaluation of matrix multiplication. This is common in game development environments where programs are written for very specific platforms, where the make and model of processor is well defined. This would impact on element layout and storage. Such work has already been demonstrated in paper N4454.

5.11. Linear Algebra and constexpr

The fundamental set of operations for linear algebra can all be implemented in terms of a subset of the algorithms defined in the <algorithm> header, all of which are marked constexpr since C++20. Matrix and vector initialization is of course also possible at compile time.

The arrival of std::is_constant_evaluated in C++20 makes it possible to offer a constexpr implementation of the operations, allowing customizations to defer to them in constexpr evaluations while taking the customization’s notionally superior run-time path in alternate situations.

6. Interface Description

In this section, we describe the various types, operators, and functions comprising the proposed interface. The reader should note that the descriptions below are by no means ready for wording; rather, they are intended to foster further discussions and refinements.

6.1. Engine Types and Supporting Traits

All of the engine types provide a basic interface for const element indexing, row and column sizes, and row and column capacities. They also export public type aliases which specify their element type, whether or not they are resizable, and a 2-tuple for containing sizes and capacities.

6.1.1. Fixed-Size Engine

Class template fs_matrix_engine<T, R, C> implements a fixed-size, fixed-capacity engine. In addition to the basic engine interface, it provides member functions for mutable element indexing and swapping rows and/or columns.

template<class T, size_t R, size_t C>
class fs_matrix_engine
{
    static_assert(is_matrix_element_v<T>);
    static_assert(R >= 1);
    static_assert(C >= 1);

  public:
    using element_type      = T;
    using is_resizable_type = false_type;
    using size_tuple        = tuple<size_t, size_t>;

  public:
    fs_matrix_engine();
    fs_matrix_engine(fs_matrix_engine&&);
    fs_matrix_engine(fs_matrix_engine const&);

    fs_matrix_engine&   operator =(fs_matrix_engine&&);
    fs_matrix_engine&   operator =(fs_matrix_engine const&);

    T           operator ()(size_t i) const;
    T           operator ()(size_t i, size_t j) const;
    T const*    data() const;

    size_t      columns() const noexcept;
    size_t      rows() const noexcept;
    size_tuple  size() const noexcept;

    size_t      column_capacity() const noexcept;
    size_t      row_capacity() const noexcept;
    size_tuple  capacity() const noexcept;

    T&      operator ()(size_t i);
    T&      operator ()(size_t i, size_t j);
    T*      data();

    void    swap_columns(size_t i, size_t j);
    void    swap_rows(size_t i, size_t j);

  private:
    T       ma_elems[R*C];  //- for exposition
    T*      mp_bias;        //- bias pointer for 1-based indexing; for exposition
};

6.1.2. Dynamically-Resizable Engine

Class template dyn_matrix_engine<T, ALLOC> implements an engine whose sizes and capacities can be changed at runtime. In addition to the basic engine interface, it provides member functions for mutable element indexing, changing size and capacity, and swapping rows and/or columns.

template<class T, class ALLOC = std::allocator<T>>
class dyn_matrix_engine
{
    static_assert(is_matrix_element_v<T>);

  public:
    using element_type      = T;
    using is_resizable_type = true_type;
    using size_tuple        = tuple<size_t, size_t>;

  public:
    dyn_matrix_engine();
    dyn_matrix_engine(dyn_matrix_engine&&);
    dyn_matrix_engine(dyn_matrix_engine const&);

    dyn_matrix_engine&  operator =(dyn_matrix_engine&&);
    dyn_matrix_engine&  operator =(dyn_matrix_engine const&);

    T           operator ()(size_t i) const;
    T           operator ()(size_t i, size_t j) const;
    T const*    data() const;

    size_t      columns() const noexcept;
    size_t      rows() const noexcept;
    size_tuple  size() const noexcept;

    size_t      column_capacity() const noexcept;
    size_t      row_capacity() const noexcept;
    size_tuple  capacity() const noexcept;

    T&      operator ()(size_t i);
    T&      operator ()(size_t i, size_t j);
    T*      data();

    void    reserve(size_tuple cap);
    void    reserve(size_t rowcap, size_t colcap);

    void    resize(size_tuple size);
    void    resize(size_t rows, size_t cols);
    void    resize(size_tuple size, size_tuple cap);
    void    resize(size_t rows, size_t cols, size_t rowcap, size_t colcap);

    void    swap_columns(size_t i, size_t j);
    void    swap_rows(size_t i, size_t j);

  private:
    using pointer = typename std::allocator_traits<ALLOC>::pointer;

    pointer     mp_elems;   //- for exposition
    T*          mp_bias;    //- bias pointer for 1-based indexing; for exposition
    size_t      m_rows;
    size_t      m_cols;
    size_t      m_rowcap;
    size_t      m_colcap;
};

6.1.3. Transpose Engine

Class template matrix_transpose_engine<ENG> implements a non-owning, const view type that provides the basic engine interface. Its primary anticipated use case is as the return value of the tr() member function of the MathObj types.

template<class ENG>
class matrix_transpose_engine
{
  public:
    using engine_type       = ENG;
    using element_type      = typename engine_type::element_type;
    using is_resizable_type = false_type;
    using size_tuple        = typename engine_type::size_tuple;

  public:
    matrix_transpose_engine();
    matrix_transpose_engine(engine_type const& eng);
    matrix_transpose_engine(matrix_transpose_engine&&);
    matrix_transpose_engine(matrix_transpose_engine const&);

    matrix_transpose_engine&    operator =(matrix_transpose_engine&&);
    matrix_transpose_engine&    operator =(matrix_transpose_engine const&);

    element_type        operator ()(size_t i) const;
    element_type        operator ()(size_t i, size_t j) const;
    element_type const* data() const;

    size_t      columns() const noexcept;
    size_t      rows() const noexcept;
    size_tuple  size() const noexcept;

    size_t      column_capacity() const noexcept;
    size_t      row_capacity() const noexcept;
    size_tuple  capacity() const noexcept;

  private:
    engine_type*    mp_other;
};

6.1.4. Element Promotion Traits

Traits type is_complex<T> determines whether its template argument T is of type std::complex<V> for some type V, where V must itself be an arithmetical type as determined by std::is_arithmetic_v<U>. Defining what constitutes an arithmetic type can be challenging; our intention is that an arithmetic type is one representing a field.

template<class T>  struct is_complex;
template<class T>  constexpr bool is_complex_v = ...;

Traits type is_matrix_element<T> is used in static assertions to ensure that MathObj types are instantiated only with element types representing a field (i.e., arithmetic types, or complex types per above). It uses is_complex<T> to help make that determination.

template<class T>  struct is_matrix_element;
template<class T>  constexpr bool is_matrix_element_v = ...;

Class template matrix_element_promotion<T1, T2> implements a traits type that determines the appropriate result type of an arithmetical operation involving two (possibly different) MathObj element types. Note the partial specializations involving complex<T> elements.

template<class T1, class T2>
struct matrix_element_promotion
{
    using type = ...;
};

template<class T1, class T2>
struct matrix_element_promotion<T1, std::complex<T2>>;

template<class T1, class T2>
struct matrix_element_promotion<std::complex<T1>, T2>;

template<class T1, class T2>
struct matrix_element_promotion<std::complex<T1>, std::complex<T2>>;

template<class T1, class T2>
using matrix_element_promotion_t = typename matrix_element_promotion<T1, T2>::type;

6.1.5. Engine Promotion Traits for Negation

Class template matrix_engine_negate_promotion<E1> implements a traits type that determines the resulting engine type when negating a MathObj.

template<class E1>
struct matrix_engine_negate_promotion
{
    using engine_type = ...;
};

template<class E1>
using matrix_engine_negate_t = typename matrix_engine_negate_promotion<E1>::engine_type;

6.1.6. Engine Promotion Traits for Addition

Class template matrix_engine_add_promotion<E1, E2> implements a traits type that determines the resulting engine type when adding two compatible MathObjs.

template<class E1, class E2>
struct matrix_engine_add_promotion
{
    using engine_type = ...;
};

template<class E1, class E2>
using matrix_engine_add_t = typename matrix_engine_add_promotion<E1, E2>::engine_type;

6.1.7. Engine Promotion Traits for Subtraction

Class template matrix_engine_subtract_promotion<E1, E2> implements a traits type that determines the resulting engine type when subtracting two compatible MathObjs.

template<class E1, class E2>
struct matrix_engine_subtract_promotion
{
    using engine_type = ...;
};

template<class E1, class E2>
using matrix_engine_subtract_t = typename matrix_engine_subtract_promotion<E1, E2>::engine_type;

6.1.8. Engine Promotion Traits for Multiplication

Class template matrix_engine_multiply_promotion<E1, E2> implements a traits type that determines the resulting engine type when multiplying two compatible MathObjs.

template<class E1, class E2>
struct matrix_engine_multiply_promotion;

{
    using engine_type = ...;
};

template<class E1, class E2>
using matrix_engine_multiply_t = typename matrix_engine_multiply_promotion<E1, E2>::engine_type;

6.2. Mathematical Types

This section describes the three main linear algebra object types proposed herein: the class templates column_vector, row_vector, and matrix.

6.2.1. Helpers

Alias template enable_if_resizable_t<ET1, ET2> is a helper used by the MathObj types to manipulate overload resolution sets for member functions that perform dynamic storage management.

template<class ET1, class ET2>
using enable_if_resizable_t =
    typename std::enable_if_t<is_same_v<ET1, ET2> && ET1::is_resizable_type::value, bool>;

6.2.2. Column Vector

Class template column_vector<ENG> provides a representation of a column vector, with element type and storage management implemented by the engine type ENG. If the engine provides dynamic resizing, then this class will as well.

template<class ENG>
class column_vector
{
  public:
    using engine_type       = ENG;
    using element_type      = typename ENG::element_type;
    using is_resizable_type = typename ENG::is_resizable_type;
    using size_tuple        = typename ENG::size_tuple;
    using transpose_type    = row_vector<matrix_transpose_engine<ENG>>;

  public:
    ~column_vector() = default;
    column_vector();
    column_vector(column_vector&&) = default;
    column_vector(column_vector const&) = default;

    template<class ET2>
    column_vector(column_vector<ET2> const& src);

    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    column_vector(size_t rows);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    column_vector(size_t rows, size_t rowcap);

    column_vector&  operator =(column_vector&&) = default;
    column_vector&  operator =(column_vector const&) = default;

    template<class ET2>
    column_vector&  operator =(column_vector<ET2> const& rhs);

    //- Const element access.
    //
    element_type        operator ()(size_t i) const;
    element_type const* data() const;

    //- Accessors.
    //
    size_t  columns() const noexcept;
    size_t  rows() const noexcept;
    size_t  size() const noexcept;

    size_t  column_capacity() const noexcept;
    size_t  row_capacity() const noexcept;
    size_t  capacity() const noexcept;

    //- Common functions.
    //
    transpose_type  tr() const;

    //- Mutable element access.
    //
    element_type    operator ()(size_t i);
    element_type*   data();

    //- Change capacity.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    reserve(size_t rowcap);

    //- Change size.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t rows);

    //- Change size and capacity in one shot.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t rows, size_t rowcap);

    //- Row operations.
    //
    void    swap_rows(size_t i, size_t j);

  private:
    engine_type     m_engine;   //- for exposition
};

6.2.3. Row Vector

Class template row_vector<ENG> provides a representation of a row vector, with element type and storage management implemented by the engine type ENG. If the engine provides resizing, then this class will as well.

template<class ENG>
class row_vector
{
  public:
    using engine_type       = ENG;
    using element_type      = typename ENG::element_type;
    using is_resizable_type = typename ENG::is_resizable_type;
    using size_tuple        = typename ENG::size_tuple;
    using transpose_type    = column_vector<matrix_transpose_engine<ENG>>;

  public:
    ~row_vector() = default;
    row_vector();
    row_vector(row_vector&&) = default;
    row_vector(row_vector const&) = default;

    template<class ET2>
    row_vector(row_vector<ET2> const& src);

    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    row_vector(size_t cols);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    row_vector(size_t cols, size_t colcap);

    row_vector&     operator =(row_vector&&) = default;
    row_vector&     operator =(row_vector const&) = default;

    template<class ET2>
    row_vector&     operator =(row_vector<ET2> const& rhs);

    //- Const element access.
    //
    element_type        operator ()(size_t i) const;
    element_type const* data() const;

    //- Accessors.
    //
    size_t  columns() const noexcept;
    size_t  rows() const noexcept;
    size_t  size() const noexcept;

    size_t  column_capacity() const noexcept;
    size_t  row_capacity() const noexcept;
    size_t  capacity() const noexcept;

    //- Common functions.
    //
    transpose_type  tr() const;

    //- Mutable element access.
    //
    element_type    operator ()(size_t i);
    element_type*   data();

    //- Change capacity.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    reserve(size_t colcap);

    //- Change size.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t cols);

    //- Change size and capacity in one shot.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t cols, size_t colcap);

    //- column operations.
    //
    void    swap_columns(size_t i, size_t j);

  private:
    engine_type     m_engine;    //- for exposition
};

6.2.4. Matrix

Class template matrix<ENG> provides a representation of a matrix, with element type and storage management implemented by the engine type ENG. If the engine provides resizing, then this class will as well.

template<class ENG>
class matrix
{
  public:
    using engine_type       = ENG;
    using element_type      = typename ENG::element_type;
    using is_resizable_type = typename ENG::is_resizable_type;
    using size_tuple        = typename ENG::size_tuple;
    using column_type       = column_vector<ENG>;
    using row_type          = row_vector<ENG>;
    using transpose_type    = matrix<matrix_transpose_engine<ENG>>;

  public:
    ~matrix() = default;
    matrix();
    matrix(matrix&&) = default;
    matrix(matrix const&) = default;

    template<class ET2>
    matrix(matrix<ET2> const& src);

    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    matrix(size_tuple size);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    matrix(size_t rows, size_t cols);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    matrix(size_t rows, size_t cols, size_t rowcap, size_t colcap);

    matrix&     operator =(matrix&&) = default;
    matrix&     operator =(matrix const&) = default;

    template<class ET2>
    matrix&     operator =(matrix<ET2> const& rhs);

    //- Const element access.
    //
    element_type        operator ()(size_t i, size_t j) const;
    element_type const* data() const;

    //- Accessors.
    //
    size_t      columns() const noexcept;
    size_t      rows() const noexcept;
    size_tuple  size() const noexcept;

    size_t      column_capacity() const noexcept;
    size_t      row_capacity() const noexcept;
    size_tuple  capacity() const noexcept;

    //- Useful functions.
    //
    column_type     column(size_t j) const;
    row_type        row(size_t i) const;
    transpose_type  tr() const;

    //- Mutable element access.
    //
    element_type    operator ()(size_t i, size_t j);
    element_type*   data();

    //- Change capacity.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    reserve(size_tuple cap);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    reserve(size_t rowcap, size_t colcap);

    //- Change size.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_tuple size);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t rows, size_t cols);

    //- Change size and capacity in one shot.
    //
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_tuple size, size_tuple cap);
    template<class ET2 = ENG, enable_if_resizable_t<ENG, ET2> = true>
    void    resize(size_t rows, size_t cols, size_t rowcap, size_t colcap);

    //- Row and column operations.
    //
    void    swap_columns(size_t i, size_t j);
    void    swap_rows(size_t i, size_t j);

  private:
    engine_type     m_engine;   //- for exposition
};

6.3. Matrix Arithmetic Traits

This section defines sets of arithmetic traits types for negation, addition, subtraction, and multiplication. The purpose of these traits types is threefold:

  1. determine the element type of the resulting MathObj,

  2. determine the engine type of the resulting MathObj, and

  3. perform the arithmetical operation.

The idea hear is that arithmetic operators (described below) simply forward to the appropriate traits type, which does the heavy lifting.

6.3.1. Negation Traits

Class template matrix_negation_traits<E1> is an arithmetic traits type that performs a negation of a MathObj and returns the result in another MathObj having an implementation-defined engine type. There are three partial specializations to support the three overloaded unary negation operators described below.

template<class E1>                              //- for exposition only; base template
struct matrix_negation_traits                   //  not implemented
{
    using engine_type = ...;                    //- implementation-defined engine result
    using result_type = MathObj<engine_type>;   //- appropriate MathObj return type

    static result_type  negate(MathObj<E1> const& m);
};

template<class E1>
struct matrix_negation_traits<column_vector<E1>>;

template<class E1>
struct matrix_negation_traits<row_vector<E1>>;

template<class E1>
struct matrix_negation_traits<matrix<E1>>;

6.3.2. Addition Traits

Class template matrix_addition_traits<E1, E2> is an arithmetic traits type that performs an addition of two compatible MathObjs and returns the result in a MathObj having an implementation-defined engine type. There are three partial specializations to support the three overloaded binary addition operators described below.

template<class E1, class E2>                    //- for exposition only; base template
struct matrix_addition_traits                   //  not implemented
{
    using engine_type = ...;                    //- implementation-defined engine result
    using result_type = MathObj<engine_type>;   //- appropriate MathObj return type

    static result_type  add(MathObj<E1> const& m1, MathObj<E2> const& m2);
};

template<class E1, class E2>
struct matrix_addition_traits<column_vector<E1>, column_vector<E2>>;

template<class E1, class E2>
struct matrix_addition_traits<row_vector<E1>, row_vector<E2>>;

template<class E1, class E2>
struct matrix_addition_traits<matrix<E1>, matrix<E2>>;

6.3.3. Subtraction Traits

Class template matrix_subtraction_traits<E1, E2> is an arithmetic traits type that performs a subtraction of two compatible MathObjs and returns the result in a MathObj having an implementation-defined engine type. There are three partial specializations to support the three overloaded binary subtraction operators described below.

template<class E1, class E2>                    //- for exposition only; base template
struct matrix_subtraction_traits                //  not implemented
{
    using engine_type = ...;                    //- implementation-defined engine result
    using result_type = MathObj<engine_type>;   //- appropriate MathObj return type

    static result_type  subtract(MathObj<E1> const& m1, MathObj<E2> const& m2);
};

template<class E1, class E2>
struct matrix_subtraction_traits<column_vector<E1>, column_vector<E2>>;

template<class E1, class E2>
struct matrix_subtraction_traits<row_vector<E1>, row_vector<E2>>;

template<class E1, class E2>
struct matrix_subtraction_traits<matrix<E1>, matrix<E2>>;

6.3.4. Multiplication Traits

Class template matrix_multiplication_traits<E1, E2> is an arithmetic traits type that performs the multiplication of two compatible MathObjs and returns the result in a MathObj having an implementation-defined engine type. Note that there are ten partial specializations to support the thirteen overloaded binary multiplication operators described below.

template<class E1, class E2>                    //- for exposition only; base template
struct matrix_multiplication_traits             //  not implemented
{
    using engine_type = ...;                    //- implementation-defined engine result
    using result_type = MathObj<engine_type>;   //- appropriate MathObj return type

    static result_type  multiply(MathObj<E1> const& m1, MathObj<E2> const& m2);
};

template<class E1, class T2>
struct matrix_multiplication_traits<column_vector<E1>, T2>;

template<class E1, class T2>
struct matrix_multiplication_traits<row_vector<E1>, T2>;

template<class E1, class T2>
struct matrix_multiplication_traits<matrix<E1>, T2>;

template<class E1, class E2>
struct matrix_multiplication_traits<row_vector<E1>, column_vector<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<column_vector<E1>, row_vector<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<matrix<E1>, column_vector<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<matrix<E1>, row_vector<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<column_vector<E1>, matrix<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<row_vector<E1>, matrix<E2>>;

template<class E1, class E2>
struct matrix_multiplication_traits<matrix<E1>, matrix<E2>>;

One partial specialization bears special consideration, and that is the one corresponding to the inner product. In this particular case, the return value from multiply() is not a MathObj, but instead a scalar:

template<class E1, class E2>
struct matrix_multiplication_traits<row_vector<E1>, column_vector<E2>>
{
    //- For exposition
    //
    using elem_type_1 = typename row_vector<E1>::element_type;
    using elem_type_2 = typename column_vector<E2>::element_type;
    using result_type = matrix_element_promotion_t<elem_type_1, elem_type_2>;

    static result_type  multiply(row_vector<E1> const& rv, column_vector<E1> const& cv);
};

6.4. Arithmetic Operators

Readers will note that the return types of the overloaded operators described below is left unspecified. This is a deliberate choice made so that implementers have the freedom to choose whatever default technique for evaluationg expressions they desire; for example, by returning temporary objects or perhaps by using expression templates.

6.4.1. Negation Operator

Three overloaded unary negation operators are provided to perform element-wise negation of a MathObj instance.

template<class E1>
auto operator -(column_vector<E1> const& m1);

template<class E1>
auto operator -(row_vector<E1> const& m1);

template<class E1>
auto operator -(matrix<E1> const& m1);

Note that function template operator -(MathObj<E1> const&) is equivalent to multiplying the argument by the scalar value of static_cast<E1>(-1).

6.4.2. Addition Operator

Three overloaded binary addition operators are provided to perform element-wise addition of two MathObj instances.

template<class E1, class E2>
auto operator +(column_vector<E1> const& m1, column_vector<E2> const& m2);

template<class E1, class E2>
auto operator +(row_vector<E1> const& m1, row_vector<E2> const& m2);

template<class E1, class E2>
auto operator +(matrix<E1> const& m1, matrix<E2> const& m2);

Specifically, function template operator +(MathObj<E1> const&, MathObj<E2>const&) performs the addition of two MathObj instances of identical dimension.

6.4.3. Subtraction Operator

Three overloaded binary subtraction operators are provided to perform element-wise subtraction of two MathObj instances.

template<class E1, class E2>
auto operator -(column_vector<E1> const& m1, column_vector<E2> const& m2);

template<class E1, class E2>
auto operator -(row_vector<E1> const& m1, row_vector<E2> const& m2);

template<class E1, class E2>
auto operator -(matrix<E1> const& m1, matrix<E2> const& m2);

Specifically, function template operator -(MathObj<E1> const&, MathObj<E2> const&) performs the subtraction of two MathObj instances of identical dimension.

6.4.4. Multiplication Operator

Thirteen overloaded binary multiplication operators are provided to perform appropriate multiplication of two MathObj instances.

template<class E1, class E2>
auto operator *(column_vector<E1> const& cv, E2 s);

template<class E1, class E2>
auto operator *(E1 s, column_vector<E2> const& cv);

template<class E1, class E2>
auto operator *(row_vector<E1> const& rv, E2 s);

template<class E1, class E2>
auto operator *(E1 s, row_vector<E2> const& rv);

template<class E1, class E2>
auto operator *(matrix<E1> const& m, E2 s);

template<class E1, class E2>
auto operator *(E1 s, matrix<E2> const& m);

template<class E1, class E2>
auto operator *(row_vector<E1> const& rv, column_vector<E2> const& cv);

template<class E1, class E2>
auto operator *(column_vector<E1> const& cv, row_vector<E2> const& rv);

template<class E1, class E2>
auto operator *(matrix<E1> const& m, column_vector<E2> const& cv);

template<class E1, class E2>
auto operator *(matrix<E1> const& m, row_vector<E2> const& rv);

template<class E1, class E2>
auto operator *(column_vector<E1> const& cv, matrix<E2> const& m);

template<class E1, class E2>
auto operator *(row_vector<E1> const& rv, matrix<E2> const& m);

template<class E1, class E2>
auto operator *(matrix<E1> const& m1, matrix<E2> const& m2);

The first six function templates,

perform multiplication of a MathObj instance and a scalar value.

The next two function templates,

perform the inner product and outer product, respectively. Although hidden by the auto return type, per the multiplication traits described above, the inner product returns a promoted scalar and the outer product returns matrix, as expected.

The last five function templates,

perform matrix multiplication of MathObjs.

The question may arise as to why thirteen multiplication operators are provided rather than the expected fifteen. Strictly speaking, it is certainly possible to specify multiplication operators for row_vector<E1>*row_vector<E2> and column_vector<E1>*column_vector<E2>, but these represent degenerate cases of vectors having one element, and so are not included in this revision.

6.5. Type Aliases

Some type aliases provide useful shorthand symbols for the most common compositions of types.

template<class T, class A = std::allocator<T>>
using dyn_col_vector = column_vector<dyn_matrix_engine<T, A>>;

template<class T, class A = std::allocator<T>>
using dyn_row_vector = row_vector<dyn_matrix_engine<T, A>>;

template<class T, class A = std::allocator<T>>
using dyn_matrix = matrix<dyn_matrix_engine<T, A>>;


template<class T, size_t R>
using fs_col_vector = column_vector<fs_matrix_engine<T, R, 1>>;

template<class T, size_t C>
using fs_row_vector = row_vector<fs_matrix_engine<T, 1, C>>;

template<class T, size_t R, size_t C>
using fs_matrix = matrix<fs_matrix_engine<T, R, C>>;

6.6. Expression Results

The following are some expressions that demonstrate how element and engine promotions are performed.

using cx_float  = std::complex<float>;
using cx_double = std::complex<double>;
using namespace std::la;

void t1()
{
    float       f = 1.0f;
    double      d = 1.0;
    cx_double   c = {1.0, 0.0};

    dyn_matrix<float>           dmf(3, 3);
    dyn_matrix<double>          dmd(3, 3);
    dyn_matrix<cx_double>       dmc(3, 3);

    fs_matrix<float, 3, 3>      fmf;
    fs_matrix<double, 3, 3>     fmd;
    fs_matrix<cx_double, 3, 3>  fmc;

    auto    m01 = dmf*fmf;      //- dyn_matrix<float>
    auto    m02 = dmd*fmd;      //- dyn_matrix<double>
    auto    m03 = dmc*fmc;      //- dyn_matrix<complex<double>>

    auto    m04 = fmf*dmf;      //- dyn_matrix<float>
    auto    m05 = fmd*dmd;      //- dyn_matrix<double>
    auto    m06 = fmc*dmc;      //- dyn_matrix<complex<double>>

    auto    m07 = fmf*fmd;      //- fs_matrix<double, 3, 3>
    auto    m08 = fmf*fmc;      //- fs_matrix<complex<double>, 3, 3>
}

void t10()
{
    dyn_col_vector<float>       dcvf(3);
    dyn_col_vector<double>      dcvd(3);

    fs_col_vector<float, 3>     fcvf;
    fs_col_vector<double, 3>    fcvd;

    dyn_row_vector<float>       drvf(3);
    dyn_row_vector<double>      drvd(3);

    fs_row_vector<float, 3>     frvf;
    fs_row_vector<double, 3>    frvd;

    dyn_matrix<double>          dmd(3, 3);
    dyn_matrix<float>           dmf(3, 3);
    dyn_matrix<float>           dmf_cv(3, 1);
    dyn_matrix<float>           dmf_rv(1, 3);

    fs_matrix<double, 3, 3>     fmd;
    fs_matrix<float, 3, 3>      fmf;
    fs_matrix<float, 3, 1>      fmf_cv;
    fs_matrix<float, 1, 3>      fmf_rv;

    auto    r01 = dmf * dcvf;       //- dyn_col_vector<float>
    auto    r02 = dmf_cv * drvf;    //- dyn_matrix<float>
    auto    r03 = drvf * dmf;       //- dyn_row_vector<float>
    auto    r04 = dcvf * fmf_rv;    //- dyn_matrix<float>

    auto    r11 = dmf * dcvd;       //- dyn_col_vector<double>
    auto    r12 = dmf_cv * drvd;    //- dyn_matrix<double>
    auto    r13 = drvf * dmd;       //- dyn_row_vector<double>
    auto    r14 = dcvd * dmf_rv;    //- dyn_matrix<double>

    auto    r21 = fmf * fcvf;       //- fs_col_vector<float, 3>
    auto    r22 = fmf_cv * frvf;    //- fs_matrix<float, 3, 3>
    auto    r23 = frvf * fmf;       //- fs_row_vector<float, 3>
    auto    r24 = fcvf * fmf_rv;    //- fs_matrix<float, 3, 3>

    auto    r31 = fmf * fcvd;       //- fs_col_vector<double, 3>
    auto    r32 = fmf_cv * frvd;    //- fs_matrix<double, 3, 3>
    auto    r33 = frvf * fmd;       //- fs_row_vector<3>
    auto    r34 = fcvd * fmf_rv;    //- fs_matrix<double, 3, 3>

    auto    r41 = drvf * dcvf;      //- float
    auto    r42 = frvf * dcvd;      //- double
    auto    r43 = frvd * fcvd;      //- double
}

7. Ongoing Work / Future Directions

There is clearly much work remaining just to flesh out and describe this lowest anticipated layer of linear algebra functionality. The authors intend to continue evolving this paper, working to address (at a minimum) several remaining issues:

We expect to provide (and present) a new revision of this paper (R1) in Kona that makes a start at answering these questions.