sg19: Re: [SG19] SG19 Dec 12 monthly call

From: Michael Wong <fraggamuffin_at_[hidden]>
Date: Thu, 12 Dec 2019 14:51:26 -0500

On Tue, Dec 10, 2019 at 9:21 PM Michael Wong <fraggamuffin_at_[hidden]> wrote:

> SG19 Machine Learning 2 hours
>
> Hi,
>
> Michael Wong is inviting you to a scheduled Zoom meeting.
>
> Topic: SG19 monthly Dec 2019-Mar 2020
> Time: Dec 12, 2019 01:00 PM Eastern Time (US and Canada)
> Every month on the Second Thu, until Mar 12, 2020, 4 occurrence(s)
> Dec 12, 2019 01:00 PM
> Jan 9, 2020 01:00 PM
> Feb 13, 2020 01:00 PM
> Mar 12, 2020 01:00 PM
> Please download and import the following iCalendar (.ics) files to
> your calendar system.
> Monthly:
> https://iso.zoom.us/meeting/u5IudOyrrD0o6QfidlHRJTh2cH9H6gRuPA/ics?icsToken=98tyKu-urTgvGdaTslyCe7MtOZX_bN-xjSd_rKpZkDfXKRRbMADeb8oUNYBqIPmB
>
> Join from PC, Mac, Linux, iOS or Android: https://iso.zoom.us/j/663353262
>
> Or iPhone one-tap :
> US: +14086380968,,663353262# or +16468769923,,663353262#
> Or Telephone:
> Dial(for higher quality, dial a number based on your current
> location):
> US: +1 408 638 0968 or +1 646 876 9923 or +1 669 900 6833 or
> 877 369 0926 (Toll Free) or 877 853 5247 (Toll Free)
> Meeting ID: 663 353 262
> International numbers available: https://iso.zoom.us/u/abhaIjFKLZ
>
> Or Skype for Business (Lync):
> https://iso.zoom.us/skype/663353262
>
>
> Agenda:
>
> 1. Opening and introductions
>
> 1.1 Roll call of participants
>
Michael Wong, Phil Ratzloff, Andrew Lumsdaine, Duygu Cakmak, Matthew
galati, Richard Dosselmann, Lukasz Wojakowski, Jesun Firoz, Scott McMillan,
DOmagoj Saric

> 1.2 Adopt agenda
>
Yes

> 1.3 Approve minutes from previous meeting, and approve publishing
> previously approved minutes to ISOCPP.org
>
> 1.4 Action items from previous meetings
>
> 2. Main issues (125 min)
>
> 2.1 General logistics
> All C++ reflector are now moved to listserv
>
> https://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
C++20 progress

>
> 2.2 Paper reviews
>
> Review Belfast results.
>
> 2.2.1: ML topics
>
> Richard Dosselman
>
> P1708R1: Math proposal for Machine Learning
>
> https://docs.google.com/document/d/1VAgcyvL1riMdGz7tQIT9eTtSSfV3CoCEMWKk8GvVuFY/edit
>
Now in latex
for these simple stat functions, should they be stand alone (simpler to
implement as well if you want to add new functions, but now you have
multiple scan of data if you wnat to do all the algorithms), or aggregate
those into a class type structure ( can do a single scan for multiple
algorithm similar to boost accumulate)
Vote results from belfast:
Should the author continue with the functionality and direction of
accumulator set or return to version 1?
(SF = accumulator_set, SA = version 1)

SF F N A SA
3 7 2 0 0
Favor accumulator set

But how do we add new functions and extend but that adds accessing the guts

Mode require passing a range and fill it up for all modes,
concern about havign multiple modes in the set of data,
Jens did not like returing a vector as it allocates memory, most STL pass
their own preallocated data structure

BFS and DFS traversal requires something allocated in the back, so no
allocation may be too hard a restriction
Can still extend for images or a set of images for facial recognition for
de-nosing
also Belfast asked for support for complex numbers

>
> P1709R1: Graph Proposal for Machine Learning
>
>
> https://docs.google.com/document/d/1QkfDzGyfNQKs86y053M0YHOLP6frzhTJqzg1Ug_vkkE/edit?usp=sharing
> <
> https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.google.com%2Fdocument%2Fd%2F1QkfDzGyfNQKs86y053M0YHOLP6frzhTJqzg1Ug_vkkE%2Fedit%3Fusp%3Dsharing&data=02%7C01%7CPhil.Ratzloff%40sas.com%7C729b2cf8502641e4ae5e08d749064578%7Cb1c14d5c362545b3a4309552373a0c2f%7C0%7C0%7C637058163592253027&sdata=4UQm8tqrcUbiZsr200UMrOaEModJYGNgP1oNot9PbAg%3D&reserved=0>
>
>
reviewing Belfast feedback
this algorithmic centric approach was good to continue, and in conjunction
with range based interpretation of graphs, and also doign it at compile
time with constexpr
and finally also address parallel executon, also question about using simd

ok constexpr was an oversight, so updated the read-only functions, unclear
whether we should change others
parallelism should be per- algorithm ? Yes seems reasonable, parallel graph
algo are not well understood, the best parallel graph algo is not just a
parallelized version of sequential, so I dont think that is a sensible algo
to have parallel DFS by just adding an execution policy, these things dont
parallelize, BFS also have synchronization issue; some have a priority
queue and requires single stepping

positive from Belfast
AL promotes a graph as range of ranges
BGL17 is a Better Graph Library using C++17
MS VS has range V3 and COncepts

So wanted to try it out with an example of adjacency array,
so look at how the graph is built, the ranges, and the implementation of
BFS and DFS
using the German route from the paper as starting point

vector of cities and edges
3 landas for cities and edge value first and seocnd

now we have pritning the structure
iterating through vertices, gettign key, outgoing edges, explciit for now
but could change to auto

BGL17 has bfs and dfs based on both nodes and vertexes

now using an iterator based on approach be cause it has depth
if not care about depth then use the range based which is second solution

the boolean is it always recorded? Yes always recorded
can we overrde the implementation? BGL17 philosophy was trying to get as
far away from original BGL
can put al lvisitor code in the body of the range based adaptor; I used BGL
and I thought that was useful
I was doign an experiment to see if this works to solve both flexibility
and performance as vsiitor can take a hit with performance

we could create a visitor that visits both edge and vertices, but it would
be more complcated to write; so MG if you can think of an example where
this does not work that would be great? I cant think of how such an eaxmple
would look like.

Centrality would be a great next algorithm,

the final one is BFS edge

now look at BFS/DFS implementation

One interestign observation is that range itself contains the content ot
state of the iteration, so as you are iteratign through so if you call
begin, it returns location of where you iterated to; this allows itrator to
be copied in a light weigh way

so need a const begin ...

created a concept called SearchableGrpahConcept
has a forward range,
to get begin and end of edges, we need to pass in the graph for context
this i causing us to have free function

so in bfs and dfs we dont have to worry whether it is directed, we just
wnat teh range of graphs to default to a set of vertices
but if you want begin of graph it returns the begin of the vertex collection
if you want begin of vertex it is begin of outedge

the ago is based on having an integral vertex key, so if we dont have an
integral vertex key, these algo wont work
yes, we wanted to have this key type that we can dereference n both the
inner and outer range
so maximal set only works on an edge list

with algorithmic centric approach, different ones will have different
requirements, we might be able to boil it down to a small list of
requirements

so enfornce to start an edge list, or user gives you an adjacency list
directly for the definition of the graph

in general, different algo knows what requirement they have, cocept define
formally what those algorithms are

we can also have nested concepts,

if it supports both the nwill have to do somethign extra to select. This
needs experiment in future

if we dont enable that design then we are lost, as we are trying to follow
STL separation of algo and containers, he intent here is that we have free
functions and we can create a single structure with the types

the whole properties might have issue when properties is mutable

data members are graph stack, vector bool
this boils down to how this is constructed

does vertex return the from or to? the to for directed graph
grabbing the key could be a hit for performance, if we know we have
integers that have no need for hasing with key labels, why are we dealing
with it
define concept for speed version
also about code size
OK so we have a template parameter in the definition of the graph? Right
and if we check if it is consecutive, the nskip key lookup, as these can
make a big difference

adjacency array with edges stored consecutively a vector or DAG, real impl
is kept in impl file along with free functions that tie this to adjanceny
array

can have this directed on any type or any structure

so we need to pass allocator since we cant add anything

cant prevent user from using the free functions, may be consequence so need
education

dont look directly to see a graph,

no public funcions for removign edges so its semi-mutable

constructor have 2 template statements, one for the graph and the other for
the constructor itself

static usage of grapph requires separet th container from algorithm so I
can define a struct of my own that obeys an interface?
then we can statically allocate the value then we wont have to wait for
constexpr STL

perhaps library can have a tool that generate static graph from dynamic
ones, like serialize an unordered map translates a dynamic TF graph to a
static C++ graph

XLA and glow will analyze a graph structure in any format like onyx nnef,
and do fusioin transformation, reordering, and generate machine code , so I
am lookign at this proposal to generate reusable tool for these graph

next iteration of paper is more examples, and more functions, and
implementing the algo

suggest we also have an undirected graph and tha twould help. AL mentions
other ones in BGL17 would also help

> Differentiable Programing by Marco Foco
>
Workign on it for next call

> P1416R1: SG19 - Linear Algebra for Data Science and Machine Learning
>
> https://docs.google.com/document/d/1IKUNiUhBgRURW-UkspK7fAAyIhfXuMxjk7xKikK4Yp8/edit#heading=h.tj9hitg7dbtr
>
> P1415: Machine Learning Layered list
>
> https://docs.google.com/document/d/1elNFdIXWoetbxjO1OKol_Wj8fyi4Z4hogfj5tLVSj64/edit#heading=h.tj9hitg7dbtr
>
> 2.2.2 SG14 Linear Algebra progress:
> Different layers of proposal
>
> https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit
>
> 2.2.3 any other proposal for reviews?
>
> 2.3 Other Papers and proposals
>
> 2.5 Future F2F meetings:
>
> 2.6 future C++ Standard meetings:
> https://isocpp.org/std/meetings-and-participation/upcoming-meetings
>
>
> -2020-02-10 to 15: Prague, Czech Republic
>
> - 2020-06-01 to 06: Bulgaria
> - 2020-11-6-14: (New York, tentative)
> - 2021-02-22 to 27: Kona, HI, USA
>
> 3. Any other business
>
> New reflector
>
> http://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
> Old Reflector
> https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19
> <https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!forum/sg14>
>
> Code and proposal Staging area
>
> 4. Review
>
> 4.1 Review and approve resolutions and issues [e.g., changes to SG's
> working draft]
>
> 4.2 Review action items (5 min)
>
> 5. Closing process
>
> 5.1 Establish next agenda
>
> TBD
>
> 5.2 Future meeting
>
> Dec 12, 2019 01:00 PM
> Jan 9, 2020 01:00 PM: Jan 13 is mailing deadline
> Feb 13, 2020 01:00 PM
> Mar 12, 2020 01:00 PM
>

Received on 2019-12-12 13:54:06