AL Andrew Lumsdaine AP Antony Peacock BS Benjamin Saks DC Duygy Cakmak DS Domagoj Saric GD Guy Davidson GH Guilherme Hartmann LW Lukasz Wojakowski MF Marco Foco MH1 Morris Hafner MH2 Mark Hoemmer MK Matthias Kretz MV Mike Voss MW Michael Wong RK Ronan Keryell RS1 Robert Steagall RS2 Rob Simpson VC Vito Castellana VR Vincent Reverdy VV Vasil Vasilev P1708R1 presented by VR AL Clarifies about the reason for returning 2 values from Median DS Is there a way to work on blocks of data (e.g. from SIMD). Some algorithm may be optimized to work with certain size? Maybe solutions can be added later. AP I want to be able to add custom versions too, but probably executors might be used next to provide customization points. VR Executors should be compatible with this proposal DS Imagine you’re implementing your own statistical functions, and you add them to a set of accumulators fed to something that expects multiple values at the same time, how can it work? AP MKL requires aligned memory, and does dynamic dispatch DS The MKL is precompiled, I’m referring to something that is composed at compile time MW (forwards the discussion on the call, where the other author will be also present) MH2 [[[]]] VR It should be included in the accumulator set design, but there might be some customisation point in order to provide custom operations MW It seems like the main author might want to go back to the original proposal, and remove the accumulator set LW In the accumulator slide you have the three-way way to MH2 I agree the new if syntax helps you to declare this AL Why do we want to do all in one pass? Performance? VR Maybe one of the reason was that some of the algorithm might need sorting, and you might want that one time DS Another example is minmax, but when you collect several functions in an accumulator_set i don’t see how those function can know each other to cooperate VR template magic, can be used to merge, maybe DS But the mechanism need to know all the function WM Why do we need multiple passes in some cases? MH2 It’s more efficient to compute all the statistics at the same time. VR The accumulator_set can also store additional information, that can be reused. MW Is there a problem with allocating memory? VR Embedded devices, also in HPC you cannot allocate over a certain limit HM2 Are those accumulator sets allowed to have storage inside? VR (Forward the answer to the author) AP Multiple passes might be a limiting factor for some use cases VR (Explains the idea suggested by EN: the original proposed design had a pipe operator, this one doesn’t: The pipe require to execute one operation after another, while this can allow to sort the operations arbitrarily.) VR Maybe we need to verify that EN had a look at the overall design or just proposed the accumulator_set. WM He didn’t, he just proposed the accumulator_set. WM The functions are present in boost VR But designed many years ago (poll proposed: continue with accumulator set?) VR The author should investigate more the inner work of accumulator_set, and especially define how to allocate memory when necessary, maybe providing an allocator DS Issues: consider supporting fusing individual functions DS Consider support from function that can consume more than 1 value at a time, and in addition cases when the value processed are of multiple kinds AP I suppose any solution should include vectorization, differently people would go back using hand-crafted functions. MH2 I can guess why the author cut down the number in comparison to the boost, but it’s cut down to not require storage anymore. Having one statistics that require additional store would help in design. VR The benefit of this design with respect to standard algorithm: it makes more clear that an algorithm allocate or have any special requirements. Should the author continue with the functionality and direction of accumulator set or return to version 1? (SF = accumulator_set, SA = version 1) SF F N A SA 3 7 2 0 0 Attendance 17 P1709R2 Graph Data Structures, presented by AL (AL presenting the powerpoint slides) BS You’re talking about the importance of working with underlying data for the graph data structure, is the motivation primarily performance, interface? AL Interface reason, and memory occupation, performance reason MH2 You might not be able to manifest the graph as data structure, but only be able to traverse. BS More detail? MH2 It’s enormous, so you cannot fit all in memory, but you can access that AL (makes an example about chess graph, where you don’t represent the possible moves, but you can produce the connections of a graph representing the moves) BS Is there a reason not to use the original representation of the data? AL Id it could be a win to convert, in that case you can always convert, but it’s not a requirement. VR The only thing you need to run an algorithm on the graph is a random access iterator on the adjacency… to me an adjacency list isn’t a vector, it’s a representation of the graph, how does the statement “there is no graph” relates to this AL Some algorithms has different requirements. However, yes, “there is no graph” is an overstatement (presents the working of the example “Generic triangle counting” as an example). The important here is that we’re presenting concepts for graphs rather than specific implementations. (AL presenting the paper, show declaration on pages 4 and then moves to examples on depth_first_search on page 12) MH2 If you have breadth_first_search you need storage? AL (presents the declaration of breadth_first_search on page 14) AL Asks feedback on the approach: AL 1) is this a useful point of view for conceiving a graph library MW This is an improvement, there’s more algorithm than data structure with respect to previous version AL (also points out at the concepts) AL As for the data structures, it’s useful to have some reference implementations VR One problem you mention in BGL was property maps… AL BGL wanted to be generic, we had the multiple options, we wanted to have a single syntax to access data with all possible underlying representation: it was complicated and caused a lot of troubles. This approach is a step back from properties, leaving the user to access properties in their specific way. BS We’ve heard questions from other proposals about SIMD. I’m wondering if people in the room that knows about SIMD are interested about commenting? VR I also implemented my own trees, and had the problem of property amps. You want to decouple as much as possible the algorithm from your data. BS Not being able to use SIMD seems like potential pitfalls HM2 foreach in simd is hard, this is harder than foreach. DS First of all, interesting approach. About vectorisation, I’m not sure if it comes into the picture at all: traversal has nothing to do with traversal, it’s more what you do with the nodes. AL The algorithms doesn’t need to pull of one element at a time. DS How easily does this give the opportunity to traverse the graph in parallel, for example with an executor. DS I see potential problems in making this graph constexpr, and do compile-time analysis of graphs. I would be happy to have solution for both compile-time and run-time graphs (Exemplifies with Neural Network) GH This is for generic graphs, neural networks are DAGs. DS I’m not asking to be only compile time, but to allow for compile time and reuse parts. MW Invite DS to share examples with AL AP Returning to the SIMD matter: it’s not obvious how to extend to SIMD, I don’t know how you can apply this in a generic fashion. Parallelism is more important. AP Proposes TBB MH2 If the hardware is designed for HW context switching you don’t want SW switching like TBB. VR This graph and call graph are two different things. AL Is there a canonical set of building blocks that we want to have? MW I don’t have any canonical set in mind. MH2 Would you consider approximation algorithm? What you consider BB? AL Boost Graph has 40-50 algorithms, do we want all those in the standard algorithm, we need direction to decide what to keep and what to leave out. AL Sparse matrices are related to graphs, is SG14 and SG19 looking for Sparse matrices? GD We have no reason to exclude the possibility, but we are not proposing MH2 There’re potential pitfalls: a graph might be expressed as matrix and vector, but a sparse matrix and a sparse vector VR There’s not much general consensus on how to implement sparse matrices, focus on dense matrices. RS2 Agrees with GD MH2 Efficient sparse matrix multiplication is tricky. DS (Requests a poll on constexpr) RK Constexpr version of this is fascinating, I work in reconfigurable computing FPGA and reconfigurable cores, I could do place&route of tasks at compile time for my program. DS not the whole library must be constexpr-ified, a subset would suffice. Poll: Supporting free functions algorithms instead of class-based design? SF F N A SA 10 6 0 0 0 19 present Poll: Consider supporting compile-time for static graph SF F N A SA 6 8 5 0 0 19 present Automatic Differentiation in C++ (V. Vassilev, M. Foco)