ISOCPP sg19 List: Re: BSI feedback on P1709R4 ("Graph Library")

From: Phil Ratzloff <Phil.Ratzloff_at_[hidden]>
Date: Wed, 21 Feb 2024 16:26:00 +0000

There has been some internal discussions from this and want to share so everyone is aware of where we’re heading.

The use of MDSpan, or irregular MDSpan, for adjacency list implies that vertices are in a random-access container. We would like to be able to accommodate sparse vertex ids in the future, implying the use of a bi-directional container like std::map, so using a form of MDSpan wouldn’t allow for that. However, the existing design should easily MDSpan as the underlying container. This is something we need to assure, as we make the use of std::vector and std::deque easy to use.

There are questions about how the proposal compares to boost::graph. Andrew will be submitting a new paper that makes that comparison. The goal is to have it in the next mailing.

Another concern is that the DFS and BFS functionality isn’t flexible enough, especially when compared to boost::graph’s visitors. I’ll be working on that design and hope to have an idea of what it will be before the Tokyo meeting, though I can’t commit to when it will be in P3128 (the new Graph Algorithms paper).

Andrew has said he’ll be revising the Circuit Design example mentioned also, as well as making other improvements in the Background and Terminology paper, now in P3127.

If there are other major areas that need attention please let us know.

From: SG19 <sg19-bounces_at_[hidden]> On Behalf Of Oliver Rosten via SG19
Sent: Thursday, January 11, 2024 5:17 PM
To: Andrew Lumsdaine <al75_at_[hidden]>
Cc: Oliver Rosten <oliver.rosten_at_[hidden]>; sg19_at_[hidden]; lebrungrandt_at_[hidden]; Roger Orr <rogero_at_[hidden]>; me_at_[hidden]; Trott, Christian Robert <crtrott_at_[hidden]>
Subject: Re: [SG19] BSI feedback on P1709R4 ("Graph Library")

EXTERNAL
Sorry shared edges --> shared edge weights

I did experiment with shared edges at some point but abandoned it as a bad idea :)

On Thu, 11 Jan 2024 at 22:16, Oliver Rosten <oliver.rosten_at_[hidden]<mailto:oliver.rosten_at_[hidden]>> wrote:
Hi Andrew,

The use case has to do with undirected edges with a mutable property. With adjacency-based graph representations, an undirected edge gets split into two directed edges (since the representation is of adjacencies — partial edges, essentially). If there are edges with properties, those are shared, and then how to handle that becomes interesting — and leads (iiuc) to your diagram.

That's not quite what I'm getting at (at least if I've understood you correctly). The end-point data, 'a' and 'b' have nothing to do with the handling of shared edges. For that I've implemented a couple of strategies:

1. Literally share the weight, using a shared pointer.
2. Store it separately for each edge. For mutations, employ a sentinel which deals with the case of an exception being thrown when one weight has been changed, but not the other.

There are a couple of ways to think about 'a' and 'b'. One would be to say that the edge weights are actually completely independent, say W and W'. Alternatively, you say that there is one component of the edge weight that is always shared - call it W - and a freedom to differ held by 'a' and 'b'. You can use 'a' and 'b' to establish an orientation for the edges. A bit like saying you've drawn a mark on your resistor, indicating the back and front for the purposes of establishing a positive current convention. (With W holding the value of the resistance.)

Note that you could choose 'a' and 'b' such that you could interpret your graph as a directed graph with incident edges recorded. But I think the structure is more flexible than this.

Oliver

PS My history includez implementing a 1D fluid flow network, which is of course very similar to electrical circuits. (I actually simulated some of the latter as part of my testing, since the code is generic.) I then reimplemented all of this and much more besides in a library of mine up on github. In terms of the algorithms available, it's nothing like as mature as BGL. But you can do some cool things like topological sorting at compile time.

On Thu, 11 Jan 2024 at 21:51, Andrew Lumsdaine <al75_at_[hidden]<mailto:al75_at_[hidden]>> wrote:
Oh — hah — yes. Very familiar with that case. At least the way we have seen it come up is more in the context of implementation than as an abstraction per se (though there of course may be a corresponding abstraction). The use case has to do with undirected edges with a mutable property. With adjacency-based graph representations, an undirected edge gets split into two directed edges (since the representation is of adjacencies — partial edges, essentially). If there are edges with properties, those are shared, and then how to handle that becomes interesting — and leads (iiuc) to your diagram.

I am also looking for names for certain other things like the thing we are calling a “structurally bipartite graph” (or more informally, a “rectangular graph”).

I did approach a mathematician about some of these things before the holidays. He initially wanted to express it all in terms of category theory. It wasn’t quite what I was hoping for.

(The reason that hypergraphs came to mind is that hypergraphs have generalized edges — i.e., the “edges” of the graph are arbitrary subsets of vertices, rather than just being subsets of size 2. That includes subsets of size 1. If one tries to interpret that in the usual sense of a graph edge as having a source and a target — it’s an edge with a source but no target — an edge to nowhere :-)).

Best Regards,
Andrew Lumsdaine

PS — Interestingly, the genesis of my work in graphs came from circuit simulation (way back). My first task as a new PhD student was to implement RCM for the sparse solver in a circuit simulator. Reimplementing it 12 years later led to BGL.

On Jan 9, 2024, at 11:25 PM, Oliver Rosten <oliver.rosten_at_[hidden]<mailto:oliver.rosten_at_[hidden]>> wrote:

This Message Is From an Untrusted Sender
You have not previously corresponded with this sender.
See https://itconnect.uw.edu/email-tags for additional information. Please contact the UW-IT Service Center, help_at_[hidden]<mailto:help_at_[hidden]> 206.221.5000, for assistance.
Hi Andrew,

Thanks for such a detailed reply. Re. the circuit diagram, it's not a hypergraph I'm thinking of. It's a structure like this:

a---W--->
<---W---b

Every edge comes as a pair of partial edges. They have a shared weight, W, as you'd expect from an undirected graph. But they also have independent end-point data, labelled a, b). This is sufficient to properly model things like electrical circuits. So this is a directed graph but with a particular restriction that all edges come in pairs, but holding data such that I don't think it's right to call it undirected (but it almost is!). I don't know the right name for this structure, but I suspect a mathematician would! This is part of my broader point: I really think we would benefit from finding a willing participant from the mathematics community to keep us on the straight and narrow!

I completely take your point about the origin of the STL. I certainly don't desire an explicit coupling between something like unstructured span and the graph library. However, I also have little doubt that knowing the shape of unstructured span would greatly help the design of graph. It think it's reasonable for development of these two things to happen concurrently. Indeed, I've cc'd the first 4 authors of the mdspan paper (apologies if I've not carried on far enough down the author list!).

Re. merging information from your two papers, I think this might help if done judiciously. I find P1709 very hard to follow, despite having spent a lot of time working with graphs!

Oliver

On Tue, 9 Jan 2024 at 22:04, Andrew Lumsdaine <al75_at_[hidden]<mailto:al75_at_[hidden]>> wrote:
Hi — Thanks for your comments. My responses in-line below.

Cheers,
Andrew Lumsdaine

On Jan 9, 2024, at 9:50 AM, Oliver Rosten via SG19 <sg19_at_[hidden]<mailto:sg19_at_[hidden]>> wrote:

Some other things to add:

1. The electrical circuit example needs more explanation, and I think this will highlight some deep issues around representing things which are seemingly trivially graphs, as graphs in practice. In what sense is a bog-standard resistor directed? I assume the reason that the graph is directed is because current has a sign and in an undirected graph it becomes ambiguous which way the current is flowing (also you may want components like diodes). But the directed representation also has issues: "can current flow from 'Vdd' to 'n0'?" should be immediately answerable from the properties of Vdd and its edges. There are other ways to represent an electrical circuit. One is as a directed graph but with incident edges recorded - but iiuc, this is excluded from the latest version of the paper. Alternatively, one could have a mathematical object, the name of which I actually don't know: it looks like an undirected graph, but where each partial edge has additional, unique, end-point data, as well as the common weight. Things like this are the reason why I think we need a broader group to look at this proposal (i.e. beyond SG19) and if we possibly can we should involve someone from the mathematics community. Otherwise there's a real danger we end up missing important insights.

The circuit example might not be the best one to illustrate an undirected graph. The connectivity structure is in fact undirected. When one starts assigning values to elements and computing currents and voltages, then directivity is required. What kind of graph you get also depends on what kind of formulation / analysis you want to do. And, in fact, if you are using a modified nodal analysis (which is what I had in mind), you can’t really model a circuit element (simply) as a single edge. In the case of a diode you have two properties, the conductance in the forward direction and the conductance in the reverse direction - you have either two properties on one edge or two edges in opposite directions. (If one makes the analogy between a sparse matrix and a graph, the latter is probably more correct.)

I think you may be thinking of a hypergraph (that is certainly another generalization of a graph — although hypergraphs can generally be reduced to bipartite graphs).

Our thinking for this paper was to go lighter on the mathematics and exposit at a more practical level. I have a more mathematical paper that we could perhaps provide as a complement to this one. I would need to merge some of P1709R5 back into that paper however, as well as expand a few things. It’s about ten pages at the moment. A revision would maybe grow it by a couple of pages.

As far as the definition of a graph goes, I was very careful to be quite precise, to make all terms well-defined, and to differentiate representation (which is useful - and which is used for describing and implementing algorithms) and the G={V, E} (which is less useful). The complementary paper I mentioned above has almost a page describing how imprecise even the math literature is about these things. Among other things, the literature is almost universally sloppy about graph vs graph representation and about vertex vs vertex enumeration. I deliberately used a somewhat different exposition than the one you always see just to hammer that point home.

2. My comment about the structure of the paper changing was a reference to previous comparisons with boost::graph. I'm sure these were in an earlier version, or am I misremembering? Either way, it would be very helpful to have a proper discussion of e.g. the move away from visitors.

We had decided not to include direct comparisons with BGL or GraphBLAS in the interest of space, but we have contributors to BGL and GraphBLAS as authors of this paper, so we can definitely include such comparisons.

I don’t think I am saying anything I haven’t said to the GraphBLAS community directly when I say GraphBLAS should be considered dangerous. (I have an open challenge to the community to prove me wrong. It’s been unmet for 6+ years.)

3. Re. the definition of a graph, there needs to be a proper discussion about whether the paper's definition of graph is what some authors call a multigraph and whether it does/does not include loops. These things are mentioned, in passing, when introducing algorithms, but terminology needs to be properly established.

Agreed. These are run-time properties of a graph and the preconditions for the algorithms would need to properly account for them. With our formulation being based on the adjacency matrix representation, there iskind of an assumption that these are not multigraphs. Of course, anyone can make a graph with multiple edges between the same two vertices once you have moved away from the dense adjacency matrix representation and are using an edge list or an adjacency list. But yes, this needs to be called out in the algorithm preconditions.

Alternatively (or in addition to) preconditions, some algorithms could probably still be used with multigraphs, but properties associated with edges could perhaps have some kind of property combiner (if it makes sense in the context of the algorithm and what the graph is modeling). In the case of the Kevin Bacon example, there are lots of edges between actors, one for each movie in which they have costarred. When doing breadth-first search, do we just traverse one edge and record a single property? Do we traverse all edges and record all the properties? Pick the one lexicograpically first? Etc.

Along these same lines we need to describe requirements on connectivity vis-a-vis computed results.

4. I think we're trying to do too much in one go in this paper. I think a great first step would be to build on mdspan and try to standardize (or at least understand) what might reasonably be called an unstructured span. This could be represented as a vector of vectors or as a vector with some auxiliary storage indicating where the partitions fall. The point is that an unstructured span, with the right invariants, is an adjacency list. If we can understand unstructured span and its desirable api, I think this will be incredibly valuable guidance for what a standardized graph container might look like.

I agree that unstructured mdspan would be super-useful in this context. We were somewhat reluctant to dive into that as some of the original thoughts about that (or closely related) go back to the dawn of generic programming: Matt Austern’s Segmented Iterators and Hierarchical Algorithms, Rene Rivera’s Hierarchical Data Structures and Related Concepts, and I am sure there are more. We didn’t want to have unstructured mdspan as a dependency since it could be several cycles before it might come into existence.

I think it would be quite useful to co-develop graphs and unstructured mdspan. The latter cuts across other SGs I think (ranges comes to mind — though linear algebra would obviously be interested in it for sparse matrices).

5. IIUC, this paper excludes pure connectivity graphs. These are incredibly helpful and, if I've understood correctly that they are not supported, would be a major omission. Another good reason, imo, to start with unstructured span!

Pure connectivity graphs are WIP. I had hoped to have something done by this mailing but ran out of time. Besides just the exposition of this kind of representation, we need to relax the requirements for some of the algorithms, with the current requirements either a specialization or an overload of the connectivity graphs.

6. I'm not convinced by the load api. We don't have a load api for vector etc. Moreover, would it not be preferable to have appropriate constructors?

I could probably go on but I think this is sufficient for now. Having graphs and associated algorithms in the standard would be fantastic. But I think we need to take a step back, involve people from the right communities, and figure out how to do this step-by-step.

Oliver

On Tue, 9 Jan 2024 at 17:18, Michael Wong via SG19 <sg19_at_[hidden]<mailto:sg19_at_[hidden]>> wrote:
Roger and BSI, Thanks for the great feedback. We SG19 will look at this feedback this Thursday on our regular call at 2-4 pm ET. Cheers.

On Mon, Jan 8, 2024 at 5:20 PM Roger Orr <rogero_at_[hidden]<mailto:rogero_at_[hidden]>> wrote:
Hello Michael,
as you will see from the minutes, the BSI briefly discussed P1709R4 "Graph Library" today.
You are listed as an author, so we are directing our feedback to you in the first instance!

As Oliver (Rosten) said "The basic premise is important, and it would be fantastic to have support for graphs in the standard."

The main items identified were:
Oliver:
- This paper is long and incomplete, it has lots of details which I think to be irrelevant, however things that are definitely relevant are missing from the paper - for example definition of graph - since people have different ideas. We need to add a mathematical perspective to the paper.

- The structure of the paper completely changed in the new revision, so now it’s hard to understand what and why they have done

- Another missing part is discussion of graph invariants

Tom (Deakin): There’s a big missing part in “Prior art” part, GraphBLAS (https://graphblas.org<https://urldefense.com/v3/__https://graphblas.org__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DX360WQc$>) eminently.

Best wishes,
Roger.
--
SG19 mailing list
SG19_at_[hidden]<mailto:SG19_at_[hidden]>
https://lists.isocpp.org/mailman/listinfo.cgi/sg19<https://urldefense.com/v3/__https://lists.isocpp.org/mailman/listinfo.cgi/sg19__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DGEl61LA$>
--
SG19 mailing list
SG19_at_lists.isocpp.org<mailto:SG19_at_[hidden]>
https://urldefense.com/v3/__https://lists.isocpp.org/mailman/listinfo.cgi/sg19__;!!K-Hz7m0Vt54!lEMnRbkdjKfReFDkvhwE2zDloIG-JEOwnu8lyeH-nsfm4K0in2mS-TkcBdVd2CoVCH3WLh8DGEl61LA$

Received on 2024-02-21 16:26:03