sg19: Re: [SG19] Oct 8 SG19 zoom call

From: Michael Wong <fraggamuffin_at_[hidden]>
Date: Thu, 8 Oct 2020 16:01:04 -0400

On Tue, Oct 6, 2020 at 5:55 PM Michael Wong <fraggamuffin_at_[hidden]> wrote:

> SG19 Machine Learning 2 hours. This session will focus on Graph paper but
> with updates from all the others optionally.
>
> Hi,
>
> Michael Wong is inviting you to a scheduled Zoom meeting.
>
> Topic: SG19 monthly Apr 2020-Oct 2020
> Time: 18:00 UTC (2:00pm Eastern Time US and Canada)
> Every month on the Second Thu, until Oct 8, 2020, 7 occurrence(s)
> Apr 9, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> May 14, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Jun 11, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Jul 9, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Aug 13, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Sep 10, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Oct 8, 2020 18:00 UTC (2:00pm Eastern Time US and Canada)
> Please download and import the following iCalendar (.ics) files to your
> calendar system.
> Monthly:
>
> https://iso.zoom.us/meeting/v50sceqopj4pyLdu5Mx1orYgnZZUj0RNqw/ics?icsToken=98tyKuuhrz0pGtyQs1-CArUqE53ibvG1kmhirrYIsQe0DDJqZQ3MDNdIYoBRAc-B
>
> Join from PC, Mac, Linux, iOS or Android:
> https://iso.zoom.us/j/291630853?pwd=WUlKbS9SNFNRa0QyWXRWenlGSDhaQT09
> Password: 339768
>
> Or iPhone one-tap :
> US: +14086380968,,291630853# or +16468769923,,291630853#
> Or Telephone:
> Dial(for higher quality, dial a number based on your current location):
> US: +1 408 638 0968 or +1 646 876 9923 or +1 669 900 6833 or +1
> 253 215 8782 or +1 301 715 8592 or +1 312 626 6799 or +1 346 248 7799
> or 877 853 5247 (Toll Free)
> Meeting ID: 291 630 853
> Password: 339768
> International numbers available: https://iso.zoom.us/u/abhaIjFKLZ
>
> Or Skype for Business (Lync):
> https://iso.zoom.us/skype/291630853
>
> Agenda:
>
> 1. Opening and introductions
>
> The ISO Code of conduct:
> https://www.iso.org/files/live/sites/isoorg/files/store/en/PUB100397.pdf
> The IEC Code of Conduct:
>
> https://basecamp.iec.ch/download/iec-code-of-conduct-for-delegates-and-experts/
> The WG21 Practices and Procedures and Code of Conduct:
>
> https://isocpp.org/std/standing-documents/sd-4-wg21-practices-and-procedures
>
> 1.1 Roll call of participants
>
Phil Ratzloff
Kevin Deweese
Matthew Galati
Richard Dosselmann
Will Wray
Xu Tony Liu
Scott McMillan
Harish Naik
Michale Wong
Andrew Lumsdaine
Larry Lewis
Jens Maurer

> 1.2 Adopt agenda
>
> 1.3 Approve minutes from previous meeting, and approve publishing
> previously approved minutes to ISOCPP.org
>
> 1.4 Action items from previous meetings
>
> 2. Main issues (125 min)
>
> 2.1 General logistics
>
> Meeting plan, focus on one paper per meeting but does not preclude other
> paper updates:
>
> Apr 9, 2020 02:00 PM EDT 1800 UTC : stats paper- DONE
> May 14, 2020 02:00 PM 1800 UTC : Stats paper replaces Differential
> calculus DONE
> Jun 11, 2020 02:00 PM 1800 UTC : Graph paper- DONE
> Jul 9, 2020 02:00 PM 1800 UTC : Stats paper -DONE
> Aug 13, 2020 02:00 PM 1800 UTC : Differential calculus + Reinforcement
> Learning
> Sep 10, 2020 02:00 PM 1800 UTC : CPPCON cancellation
> Oct 8, 2020 02:00 PM 1800 UTC : Graph paper
>
> Nov 12, 2020: 1800 UTC: DST Madness cancellation and break
>
> Dec 10, 2020: 1800 UTC: stats paper
>
> Jan 14, 2021: 1800 UTC: differential calculus + reinforcement learning
>
> Feb 11, 2021: 1800 UTC: Graph paper
>
> ISO meeting status
>
> CPPCON report
>
> future C++ Std meetings
>
> 2.2 Paper reviews
>
> 2.2.1: ML topics
>
> 2.2.1.1 Stats review Richard Dosselman et al
>
> P1708R1: Math proposal for Machine Learning
>
> https://docs.google.com/document/d/1VAgcyvL1riMdGz7tQIT9eTtSSfV3CoCEMWKk8GvVuFY/edit
>
> > std.org/jtc1/sc22/wg21/docs/papers/2020/p1708r2
> > above is the stats paper that was reviewed in Prague
> > http://wiki.edg.com/bin/view/Wg21prague/P1708R2SG19
> >
> > Review Jolanta Polish feedback.
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html
>
> Unit library cppcon 2020:
>
> https://www.youtube.com/watch?v=aN6-Kz0HqRw&feature=emb_logo
>
stats proposal now have history, concept and range use, addressing feedback
from Jolant and Walter, one scan for sorted_quantiles
have sorted variants working any kind of range
unsorted works on random access range
example use structured binding
still support median of a string
different defaults for skewness
send to Jolanta, Walter email

> 2.2.1.2 Reinforcement Learning Larry Lewis Jorge Silva
>
> Reinforcement Learning proposal:
>
we need something that owns the data as opposed to mdspan, but we need
something for mdarray
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1684r0.pdf
or ndarray
https://www.youtube.com/watch?v=aHw0UxiCAFs&feature=emb_logo

> 2.2.1.3 Graph Proposal Phil Ratsloff et al
>
> P1709R1: Graph Proposal for Machine Learning
>
> P1709R3:
>
> https://docs.google.com/document/d/1kLHhbSTX7j0tPeTYECQFSNx3R35Mu3xO5_dyYdRy4dM/edit?usp=sharing
>
>
> https://docs.google.com/document/d/1QkfDzGyfNQKs86y053M0YHOLP6frzhTJqzg1Ug_vkkE/edit?usp=sharing
>
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html>
>
changes since R2:
input from LEWG and Intel, want to be involved so changes are based on that
input
name changes are made
used to have directed adjacency array is now directed adjacency vector,
represents storage for the vector to help understand underlying storage,
array is confusing
Always use long names and no abbreviations; this makes me happy because
code is read more then is written
drop _c for concepts because it is not the naming convention, all dropped
except in one case where there is a name conflict

Graph trait introduced has type we need for the graph, depends on the needs
of the algo are and which type to be used, want to emphasize the uniform
ones to work on direct (with inward and outward type) and indirected graph
graph work on organized adjacency, no such thing as a directed adjacency
list, we need to think zen gardening, what does it do
if you have inward and outward type then you might want that distinction

in graph trait structure seems to have many redundant names, is there any
way to remove half of those typedefs and streamline with a structure that
captures hierarchy?
AL: graphs is a range of ranges so expect user to specialize the graph
trait for their own graph, so better to minimize those requirements, thus
having global template aliases or spend that much on names depends : use
some of those trait classes for iterators or ranges, rather then having to
describe them here
JM: one type you have is the range type and the rest falls out from that;
some kind of adaptation or user input may be ok
AL: inner container is at least forward range and we can have a doubly
linked list

added an edge key type based on recommendation, nothing in algo use that
though I have used it internally
Why const edge key type?
edge key type with actual value type, then there should not be const
variant; a const variant poins to something that is not modifiable

ranges has a lot more structure

on to functions:
no except is now only on the size functions
added new functions, range of vertices on a graph
can be based on vertex or vertex key
now also have vertice size function

what you get back is the inner range
what is a verex vs vertex key? Vertex range does not refer to the keys
function in the library may be expensive; as expensive as iterating through
edges, these are constant time look ups,
If I have a vector of list, it would first pull out the key, and then do a
search within the outer vector to lookup the vertex

Are these new functions useful and what should they be returning?
DO we separate vertex size function?
Vertex key gets back same vertex range type seems limiting if I want a
different data structure when I do a key lookup. What does it do? give me
adjacent vertex . So why is it a non-const reference? could be a mistake.
I was debating whether it is an integer or a string, but in vector of
lists example, when sifting you are shifting through the vector of edges,
whereas when you enumerate vertexes of the entire graph and is a different
kind of iteration
So essentially a double lookup, give me next edge of the list and also give
me the internal lookup

Now the graph data structure
No changes to the algorithms
added 4 concepts but they were duplicates and now there are 2: extract
value of the range or the property of the vertex and verify it is an input
range
but the name ranges does not exists in global namespace, Oh Ok I was using
range-v3
declare template parameter with template syntax, can omit ... so think you
want to swap the first 2 template parameter for vertex_range_extractor
move this concept into template parameter list
will process on my own

show example graph trait to demonstrate how it might be used

this is partial specialization of class template instead of overriding,
should be no partial specialized function template, but could be explicit
specialization

added section at the bottom for adapting to external graphs, need expansion
to allow other people's external graphs,

one comment: function that have to return when passing a graph and a vertex
in an Edges function, when vertex is not a member of the graph, should I
return an empty function? make it UB but it may be a vertex with no edges,
unless we say it must be valid vertex; so give rationale why UB is chosen

suggestion: remove adjacency matrix since focus is on algorithm, Intel
asked for it
should the paper only consists of algorithms, and whether we should provide
a graph data structure that can be used out of the box - one case where it
is useful is compressed sparse row matrix where inner container are all
taken from contiguous array

want a trivial vector of lists works as a graph

separate paper: the difference between adjacency thing and the data that
the graph comes from, functionality to help extract that kind of adjacency
structure and that would be useful

I understand people are uncomfortable to have a data structure with this
paper

graph have edges and vertices, adjacency structure gives a way to traverse
edges, incidence records the connection between a vertex and edge, like an
array
adjacency: vertexes in rows and columns
incidence: vertexes in rows but edges on columns, so edges will leave a 1
or -1, so when you multiple matrix times transpose, it will be adjacency
matrix; records connection a set of things and another set of things e.g.
set of actors that work together to create a movie database, index into an
inner container is not something you can index into an outer container;
recommender systems are based on connection between one set to another set
of entities

there are so many ways to create graph, so lets provide something of
general use

CSR has the possibility of gap between vertexes
data structure papers takes years because they add concerns for
performance, caching, concurrency

dont spend too much time with integration with concept paper as that is
still in flux

then it also decouples the adjacency matrix or not, wnat to kick the tire
and give me something to work with

what about a graph view, interested in doing something for that

in what way will it differ from graph data structure? allow graph subset
which is not like a view which is non-owning

need more concept

constexpr vector is another are of concern

> 2.2.1.4: Differentiable Programming by Marco Foco
>
> <
>
> https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit
> >
>
> 2.2.3 any other proposal for reviews?
>
> 2.3 Other Papers and proposals
>
> P1416R1: SG19 - Linear Algebra for Data Science and Machine Learning
>
> https://docs.google.com/document/d/1IKUNiUhBgRURW-UkspK7fAAyIhfXuMxjk7xKikK4Yp8/edit#heading=h.tj9hitg7dbtr
>
> P1415: Machine Learning Layered list
>
> https://docs.google.com/document/d/1elNFdIXWoetbxjO1OKol_Wj8fyi4Z4hogfj5tLVSj64/edit#heading=h.tj9hitg7dbtr
>
> 2.2.2 SG14 Linear Algebra progress:
> Different layers of proposal
>
> https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit
>
> 2.5 Future F2F meetings:
>
> 2.6 future C++ Standard meetings:
> https://isocpp.org/std/meetings-and-participation/upcoming-meetings
>
> None
>
> 3. Any other business
>
> New reflector
>
> http://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
> Old Reflector
> https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19
> <https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!forum/sg14>
>
> Code and proposal Staging area
>
> 4. Review
>
> 4.1 Review and approve resolutions and issues [e.g., changes to SG's
> working draft]
>
> 4.2 Review action items (5 min)
>
> 5. Closing process
>
> 5.1 Establish next agenda
>
> TBD
>
> 5.2 Future meeting
>
> Nov 12, 2020: 1800 UTC: DST Madness cancellation and break
>
> Dec 10, 2020: 1800 UTC: stats paper
>
> Jan 14, 2021: 1800 UTC: differential calculus + reinforcement learning
>
> Feb 11, 2021: 1800 UTC: Graph paper
>

Received on 2020-10-08 15:01:24