ISOCPP sg19 List: Re: BSI feedback on P1709R4 ("Graph Library")

From: Andrew Lumsdaine <al75_at_[hidden]>
Date: Tue, 9 Jan 2024 22:02:34 +0000

Hi — Thanks for your comments. My responses in-line below.

Cheers,
Andrew Lumsdaine

On Jan 9, 2024, at 9:50 AM, Oliver Rosten via SG19 <sg19_at_[hidden]> wrote:

Some other things to add:

1. The electrical circuit example needs more explanation, and I think this will highlight some deep issues around representing things which are seemingly trivially graphs, as graphs in practice. In what sense is a bog-standard resistor directed? I assume the reason that the graph is directed is because current has a sign and in an undirected graph it becomes ambiguous which way the current is flowing (also you may want components like diodes). But the directed representation also has issues: "can current flow from 'Vdd' to 'n0'?" should be immediately answerable from the properties of Vdd and its edges. There are other ways to represent an electrical circuit. One is as a directed graph but with incident edges recorded - but iiuc, this is excluded from the latest version of the paper. Alternatively, one could have a mathematical object, the name of which I actually don't know: it looks like an undirected graph, but where each partial edge has additional, unique, end-point data, as well as the common weight. Things like this are the reason why I think we need a broader group to look at this proposal (i.e. beyond SG19) and if we possibly can we should involve someone from the mathematics community. Otherwise there's a real danger we end up missing important insights.

The circuit example might not be the best one to illustrate an undirected graph. The connectivity structure is in fact undirected. When one starts assigning values to elements and computing currents and voltages, then directivity is required. What kind of graph you get also depends on what kind of formulation / analysis you want to do. And, in fact, if you are using a modified nodal analysis (which is what I had in mind), you can’t really model a circuit element (simply) as a single edge. In the case of a diode you have two properties, the conductance in the forward direction and the conductance in the reverse direction - you have either two properties on one edge or two edges in opposite directions. (If one makes the analogy between a sparse matrix and a graph, the latter is probably more correct.)

I think you may be thinking of a hypergraph (that is certainly another generalization of a graph — although hypergraphs can generally be reduced to bipartite graphs).

Our thinking for this paper was to go lighter on the mathematics and exposit at a more practical level. I have a more mathematical paper that we could perhaps provide as a complement to this one. I would need to merge some of P1709R5 back into that paper however, as well as expand a few things. It’s about ten pages at the moment. A revision would maybe grow it by a couple of pages.

As far as the definition of a graph goes, I was very careful to be quite precise, to make all terms well-defined, and to differentiate representation (which is useful - and which is used for describing and implementing algorithms) and the G={V, E} (which is less useful). The complementary paper I mentioned above has almost a page describing how imprecise even the math literature is about these things. Among other things, the literature is almost universally sloppy about graph vs graph representation and about vertex vs vertex enumeration. I deliberately used a somewhat different exposition than the one you always see just to hammer that point home.

2. My comment about the structure of the paper changing was a reference to previous comparisons with boost::graph. I'm sure these were in an earlier version, or am I misremembering? Either way, it would be very helpful to have a proper discussion of e.g. the move away from visitors.

We had decided not to include direct comparisons with BGL or GraphBLAS in the interest of space, but we have contributors to BGL and GraphBLAS as authors of this paper, so we can definitely include such comparisons.

I don’t think I am saying anything I haven’t said to the GraphBLAS community directly when I say GraphBLAS should be considered dangerous. (I have an open challenge to the community to prove me wrong. It’s been unmet for 6+ years.)

3. Re. the definition of a graph, there needs to be a proper discussion about whether the paper's definition of graph is what some authors call a multigraph and whether it does/does not include loops. These things are mentioned, in passing, when introducing algorithms, but terminology needs to be properly established.

Agreed. These are run-time properties of a graph and the preconditions for the algorithms would need to properly account for them. With our formulation being based on the adjacency matrix representation, there is kind of an assumption that these are not multigraphs. Of course, anyone can make a graph with multiple edges between the same two vertices once you have moved away from the dense adjacency matrix representation and are using an edge list or an adjacency list. But yes, this needs to be called out in the algorithm preconditions.

Alternatively (or in addition to) preconditions, some algorithms could probably still be used with multigraphs, but properties associated with edges could perhaps have some kind of property combiner (if it makes sense in the context of the algorithm and what the graph is modeling). In the case of the Kevin Bacon example, there are lots of edges between actors, one for each movie in which they have costarred. When doing breadth-first search, do we just traverse one edge and record a single property? Do we traverse all edges and record all the properties? Pick the one lexicograpically first? Etc.

Along these same lines we need to describe requirements on connectivity vis-a-vis computed results.

4. I think we're trying to do too much in one go in this paper. I think a great first step would be to build on mdspan and try to standardize (or at least understand) what might reasonably be called an unstructured span. This could be represented as a vector of vectors or as a vector with some auxiliary storage indicating where the partitions fall. The point is that an unstructured span, with the right invariants, is an adjacency list. If we can understand unstructured span and its desirable api, I think this will be incredibly valuable guidance for what a standardized graph container might look like.

I agree that unstructured mdspan would be super-useful in this context. We were somewhat reluctant to dive into that as some of the original thoughts about that (or closely related) go back to the dawn of generic programming: Matt Austern’s Segmented Iterators and Hierarchical Algorithms, Rene Rivera’s Hierarchical Data Structures and Related Concepts, and I am sure there are more. We didn’t want to have unstructured mdspan as a dependency since it could be several cycles before it might come into existence.

I think it would be quite useful to co-develop graphs and unstructured mdspan. The latter cuts across other SGs I think (ranges comes to mind — though linear algebra would obviously be interested in it for sparse matrices).

5. IIUC, this paper excludes pure connectivity graphs. These are incredibly helpful and, if I've understood correctly that they are not supported, would be a major omission. Another good reason, imo, to start with unstructured span!

Pure connectivity graphs are WIP. I had hoped to have something done by this mailing but ran out of time. Besides just the exposition of this kind of representation, we need to relax the requirements for some of the algorithms, with the current requirements either a specialization or an overload of the connectivity graphs.

6. I'm not convinced by the load api. We don't have a load api for vector etc. Moreover, would it not be preferable to have appropriate constructors?

I could probably go on but I think this is sufficient for now. Having graphs and associated algorithms in the standard would be fantastic. But I think we need to take a step back, involve people from the right communities, and figure out how to do this step-by-step.

Oliver

On Tue, 9 Jan 2024 at 17:18, Michael Wong via SG19 <sg19_at_[hidden]<mailto:sg19_at_[hidden]>> wrote:
Roger and BSI, Thanks for the great feedback. We SG19 will look at this feedback this Thursday on our regular call at 2-4 pm ET. Cheers.

On Mon, Jan 8, 2024 at 5:20 PM Roger Orr <rogero_at_[hidden]<mailto:rogero_at_[hidden]>> wrote:
Hello Michael,
as you will see from the minutes, the BSI briefly discussed P1709R4 "Graph Library" today.
You are listed as an author, so we are directing our feedback to you in the first instance!

As Oliver (Rosten) said "The basic premise is important, and it would be fantastic to have support for graphs in the standard."

The main items identified were:
Oliver:
- This paper is long and incomplete, it has lots of details which I think to be irrelevant, however things that are definitely relevant are missing from the paper - for example definition of graph - since people have different ideas. We need to add a mathematical perspective to the paper.

- The structure of the paper completely changed in the new revision, so now it’s hard to understand what and why they have done

- Another missing part is discussion of graph invariants

Tom (Deakin): There’s a big missing part in “Prior art” part, GraphBLAS (https://graphblas.org<https://urldefense.com/v3/__https://graphblas.org__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DX360WQc$>) eminently.

Best wishes,
Roger.
--
SG19 mailing list
SG19_at_lists.isocpp.org<mailto:SG19_at_[hidden]>
https://lists.isocpp.org/mailman/listinfo.cgi/sg19<https://urldefense.com/v3/__https://lists.isocpp.org/mailman/listinfo.cgi/sg19__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DGEl61LA$>
--
SG19 mailing list
SG19_at_[hidden]<mailto:SG19_at_[hidden]p.org>
https://urldefense.com/v3/__https://lists.isocpp.org/mailman/listinfo.cgi/sg19__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DGEl61LA$

Received on 2024-01-09 22:04:16