That sounds good.

I will make a concerted effort to ensure the const versions of types and functions are complete.

I expect that to add to the types and functions that exist, but it won’t add any new functionality.

 

 

From: SG19 <sg19-bounces@lists.isocpp.org> On Behalf Of Michael Wong via SG19
Sent: Tuesday, June 16, 2020 12:14 AM
To: sg19@lists.isocpp.org
Cc: Michael Wong <fraggamuffin@gmail.com>
Subject: Re: [SG19] SG19 June 11 monthly call

 

Meeting notes.

 

Hi all, I am sorry my power died. Phil, I heard at the end that you might still have a few things to do on the paper but it seems largely done.

I would like to see if we can get that done for the next call in July, and possibly vote on it in SG19 to pass it to LEWG.

 

I would like to correct the schedule: the next call in July will be on the Stats paper review. The Aug call will be on RL and AD.

So assuming that is possible, I have adjusted the schedule but that too can be changed.

 

Thanks all.

 

 

On Wed, Jun 10, 2020 at 12:50 PM Michael Wong <fraggamuffin@gmail.com> wrote:

SG19 Machine Learning 2 hours
Hi,

Michael Wong is inviting you to a scheduled Zoom meeting.

Topic: SG19 monthly Apr 2020-Oct 2020
Time:  02:00 PM Eastern Time (US and Canada) 18:00 UTC
    Every month on the Second Thu, until Oct 8, 2020, 7 occurrence(s)
    Apr 9, 2020 02:00 PM 18:00 UTC
    May 14, 2020 02:00 PM 18:00 UTC
    Jun 11, 2020 02:00 PM 18:00 UTC
    Jul 9, 2020 02:00 PM 18:00 UTC
    Aug 13, 2020 02:00 PM 18:00 UTC
    Sep 10, 2020 02:00 PM 18:00 UTC
    Oct 8, 2020 02:00 PM 18:00 UTC
    Please download and import the following iCalendar (.ics) files to your
calendar system.
    Monthly:
https://iso.zoom.us/meeting/v50sceqopj4pyLdu5Mx1orYgnZZUj0RNqw/ics?icsToken=98tyKuuhrz0pGtyQs1-CArUqE53ibvG1kmhirrYIsQe0DDJqZQ3MDNdIYoBRAc-B

Join from PC, Mac, Linux, iOS or Android:
https://iso.zoom.us/j/291630853?pwd=WUlKbS9SNFNRa0QyWXRWenlGSDhaQT09
    Password: 339768

Or iPhone one-tap :
    US: +14086380968,,291630853# or +16468769923,,291630853#
Or Telephone:
    Dial(for higher quality, dial a number based on your current location):
        US: +1 408 638 0968 or +1 646 876 9923 or +1 669 900 6833 or +1
253 215 8782 or +1 301 715 8592 or +1 312 626 6799 or +1 346 248 7799
 or 877 853 5247 (Toll Free)
    Meeting ID: 291 630 853
    Password: 339768
    International numbers available: https://iso.zoom.us/u/abhaIjFKLZ

Or Skype for Business (Lync):
    https://iso.zoom.us/skype/291630853

Agenda:

1. Opening and introductions

1.1 Roll call of participants

Michael Wong, Richard Dosselmann, Phil Ratzloff, Jorge Silva, Larry Lewis, Kevin Deweese, Scott McMillan, Andrew Lumsdaine, Jesun Firoz, Marco Foco

1.2 Adopt agenda

Yes

1.3 Approve minutes from previous meeting, and approve publishing
 previously approved minutes to ISOCPP.org

1.4 Action items from previous meetings

2. Main issues (125 min)

2.1 General logistics

Meeting plan, focus on one paper per meeting but does not preclude other
paper updates:

    Apr 9, 2020 02:00 PM: Stats paper - DONE
    May 14, 2020 02:00 PM: Stats paper replaces Differential calculus - DONE
    Jun 11, 2020 02:00 PM: Graph paper
    Jul 9, 2020 02:00 PM: Stats paper + Graph paper vote
    Aug 13, 2020 02:00 PM: Differential calculus  + Reinforcement Learning
    Sep 10, 2020 02:00 PM: Graph paper +stats paper
    Oct 8, 2020 02:00 PM: Differential calculus  + Reinforcement Learning

ISO meeting status

No meetings until the end of the year; this year we do a deep dive on each ML topic

Papers can still move through using the online meetings for EWG, LEWG, though there is no decision made online, just tentative decisions

CPPCON status

Will happen in hybrid form

Phil submitted a proposal for CppCon

2.2 Paper reviews

2.2.1: ML topics

Larry Lewis, Jorge Silva

Reinforcement Learning proposal:

RL within SAS

provide guidance and an API for RL, optimizers

build on top of generic machine learning

independent of vendors, but follow pytorch, tensorflow

dependent on the underlying tensor

core RL algorithms, a ton of research, it is newer than deep learning

some algos will be distributed, get feedback from the group doing tensors and LA

will you need some of the underlying facilities like Supervised and unsupervised ML? Yes

will you base this on certain libraries? we are most familiar with pytorch, tensorflow when it becomes more functional, we will pick that up

will you use GPUs? yes pytorch is transparent in that respect

for deep learning, we don't use pytorch, just for RL, I don't want to be just dependent on pytorch

the hard part can be the design phase for ISO C++, which can take a lot more time than you think

as C++ moves, we have to adjust the design to match the new style (C++20, 23); we started with data structures, and switched to functions for stats

GPU is where performance comes from; yes, in future we will talk about parallel ranges and SYCL

 

what about automatic differentiation? for a tensor we do need both GPU and AD - this helps with back propagation

for AD: do you build a network of optimizers?  yes plan to reuse work from other teams; have implemented NN too many times

trying  hard to make AD that works for everybody? library or language

SG7 reflection had a lot of polarization on this topic: one side says it should be language, another side says it should be library; but we need code introspection, and to generate from the code

we don't want to standardize the entire AST of C++

 

pytorch differentiates with AD and forward differentiation meant you could not single step forward in the code, but pytorch can do that
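
As a rough illustration of the library side of the AD discussion above, forward-mode differentiation can be sketched with dual numbers and operator overloading. The names `ad::Dual` and `ad::derivative` are hypothetical and were not discussed on the call; this is a sketch under those assumptions, not a proposed design.

```cpp
#include <cassert>
#include <cmath>

namespace ad {

// Hypothetical dual number: a value together with its derivative.
struct Dual {
    double val;  // f(x)
    double der;  // f'(x)
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
inline Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.der - b.der}; }

// Product rule: (ab)' = a'b + ab'
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Quotient rule: (a/b)' = (a'b - ab') / b^2
inline Dual operator/(Dual a, Dual b) {
    assert(b.val != 0.0);
    return {a.val / b.val, (a.der * b.val - a.val * b.der) / (b.val * b.val)};
}

// Chain rule for sin: (sin a)' = cos(a) * a'
inline Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Evaluate f at x and return df/dx by seeding the derivative with 1.
template <class F>
double derivative(F f, double x) {
    return f(Dual{x, 1.0}).der;
}

}  // namespace ad
```

Because the derivative is carried along with each ordinary operation, the code can still be single-stepped in a debugger; reverse mode (back propagation) would instead need a recorded graph of operations.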

 

 

 

Phil Ratzloff et al

P1709R1: Graph Proposal for Machine Learning

P1709R3:
https://docs.google.com/document/d/1kLHhbSTX7j0tPeTYECQFSNx3R35Mu3xO5_dyYdRy4dM/edit?usp=sharing

https://docs.google.com/document/d/1QkfDzGyfNQKs86y053M0YHOLP6frzhTJqzg1Ug_vkkE/edit?usp=sharing

I’ve been working on the prototype implementation to get it building in both Windows & Linux, using CMake & the Conan package manager:

  1. All unit tests complete successfully for both MSVC & gcc10
  2. All bgl17 code has been removed from the repository. It uses a cloned bgl17 directory (ENABLE_BGL17 cmake option).
  3. Catch2 is now being used instead of Google Test for unit testing
  4. A simple unit test demonstrates the use of the library’s dfs_vertex_range iteration using bgl17’s vov graph. This can be seen in test/test_vov_adaptor.cpp.
    1. There were a few changes needed in bgl17 to accommodate this (I haven’t pushed these changes)
      i. I added an inner_container type definition to vov
      ii. There were 3 places where I added #ifdef _MSC_VER to disable linux-specific code, far fewer than before.
    2. Adapting vov requires the following
      i. An adaptor graph class to map the vov types to expected types
      ii. Function overloads that use the adaptor graph class as a template argument
  5. Added graph API functions to avoid name ambiguity with begin(g) & end(g) for vertices in the dfs & bfs range iterators.
    1. vertex_begin(g), vertex_end(g)
    2. edge_begin(g,u), edge_end(g,u)
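
A minimal sketch of how the four functions named above might look for a vector-of-vectors graph. The `vov_graph` alias, the choice of passing `u` as a vertex iterator, and the `count_edges` helper are assumptions for illustration; the prototype's actual signatures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector-of-vectors ("vov") graph:
// g[u] holds the target vertex keys of the edges leaving u.
using vov_graph = std::vector<std::vector<std::size_t>>;

// Vertex iteration gets its own names, so begin(g)/end(g) stay unambiguous.
inline auto vertex_begin(const vov_graph& g) { return g.begin(); }
inline auto vertex_end(const vov_graph& g) { return g.end(); }

// Outgoing edges of the vertex that iterator u points at.
inline auto edge_begin(const vov_graph& g, vov_graph::const_iterator u) {
    assert(u != g.end());
    return u->begin();
}
inline auto edge_end(const vov_graph& g, vov_graph::const_iterator u) {
    assert(u != g.end());
    return u->end();
}

// Count all edges using only the four functions above.
inline std::size_t count_edges(const vov_graph& g) {
    std::size_t n = 0;
    for (auto u = vertex_begin(g); u != vertex_end(g); ++u)
        for (auto uv = edge_begin(g, u); uv != edge_end(g, u); ++uv)
            ++n;
    return n;
}
```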

 

I haven’t yet written the code to support the value(uv) function, which gets edge properties, for vov.

These changes should bring the library much closer to a repeatable cross-platform build and you’re welcome to try it.

I’ve pushed the code to the master branch at https://github.com/pratzl/graph

 

The next SG19 meeting is 6/11/20 (12d from now) and I have some things in mind to work on. I’ve been focused on the prototype to make it more accessible for all the authors and I need to switch back to the paper and give it more attention.

  1. Paper
    1. Complete algorithm descriptions & examples:
      i. Connected Components
      ii. Strongly Connected Components
      iii. Bi-connected Components
      iv. Articulation Points
    2. Data structures
      i. Add section on graph adaptors
  2. Algorithm implementations
    1. connected & strongly connected components unit tests
    2. [bi-connected components]
    3. [articulation points]
  3. bgl17 adaptors
    1. vov adaptor: implement value(edge), add dfs_edge_range tests
    2. implement a compressed adaptor
  4. Other prototype features
    1. Support Clang10 using the range-v3 concepts macros
  5. Documentation
    1. Add explicit description of how to install and use the library
 

90-95% done, major sections there, need examples

prototype email library works on linux and windows

using cmake conan, unit test framework

 

all algo have iterators and can also take range

output_iterator concept added, requires an output iterator; Richard might want to use that

 

added vertex_begin and vertex_end so I can ask for the graph-based one explicitly

 

can iterate through graph, have begin and end, with starting vertex,  can also construct one with a range
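
To make the idea of a traversal range with begin and end and a starting vertex concrete, here is a minimal sketch of a depth-first vertex range over a plain adjacency list. The `adjacency` alias and the class name `dfs_vertex_sketch` are hypothetical; this is not the prototype's dfs_vertex_range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical adjacency list: g[u] lists the vertices adjacent to u.
using adjacency = std::vector<std::vector<std::size_t>>;

// Single-pass range that yields the vertices reachable from `seed`
// in depth-first (preorder) order as it is iterated.
class dfs_vertex_sketch {
    struct frame {
        std::size_t vertex;     // vertex whose edges are being scanned
        std::size_t next_edge;  // index of the next edge to look at
    };

public:
    dfs_vertex_sketch(const adjacency& g, std::size_t seed)
        : g_(&g), visited_(g.size(), false), current_(seed) {
        assert(seed < g.size());
        visited_[seed] = true;
        stack_.push_back(frame{seed, 0});
    }

    class iterator {
    public:
        explicit iterator(dfs_vertex_sketch* range) : range_(range) {}
        std::size_t operator*() const { return range_->current_; }
        iterator& operator++() {
            if (!range_->advance()) range_ = nullptr;  // exhausted: equals end()
            return *this;
        }
        bool operator!=(const iterator& other) const { return range_ != other.range_; }

    private:
        dfs_vertex_sketch* range_;
    };

    iterator begin() { return iterator(this); }
    iterator end() { return iterator(nullptr); }

private:
    // Moves to the next undiscovered vertex; returns false when none is left.
    bool advance() {
        while (!stack_.empty()) {
            frame& top = stack_.back();
            const std::vector<std::size_t>& out = (*g_)[top.vertex];
            if (top.next_edge == out.size()) {
                stack_.pop_back();  // all edges scanned: backtrack
                continue;
            }
            const std::size_t v = out[top.next_edge++];
            assert(v < visited_.size());
            if (!visited_[v]) {
                visited_[v] = true;
                current_ = v;
                stack_.push_back(frame{v, 0});
                return true;
            }
        }
        return false;
    }

    const adjacency* g_;
    std::vector<bool> visited_;
    std::vector<frame> stack_;
    std::size_t current_;
};
```

The range works with a range-based for loop, e.g. `for (std::size_t v : dfs_vertex_sketch(g, 0))`, and keeps its stack of partially scanned vertices so that the visiting order matches a recursive depth-first search.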

 

a section on graph data structure has been rewritten, carried from the beginning, reflect what I have in my prototype

have classes with common template types, 3 types for user values

there is an index type which is either 32 or 16 bit value

default of 32 bit is most of the case

there is the allocator

a section on what kind of user-defined type to use, for weights for example, in an adjacency list

 

compressed has been renamed to directed_adjacency_array to complement undirected

 

can define properties for a graph and edge/vertex

only user-defined properties can be changed after it's been constructed

other constraint is source edges have to be ordered by vertex key

which is DAA graph, and is a template alias with various defaults

have various classes to implement this

access is defined by public functions

 

possible to customize by overriding

 

Do I need all the constant types?

Think yes, both const and non-const

also think so, we did that with BGL17 as well, else it's annoying

I need to prove it for myself though I also think so

a lot of boilerplate stuff to make all this work; yes can shortcut with enable_if as well
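
A small sketch of why both const and non-const versions are wanted: a read-only algorithm that takes the graph by const reference only compiles if the const overloads exist. The class `tiny_graph` and the function `sum_values` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal graph holding one user value per vertex.
template <class VV>
class tiny_graph {
public:
    using vertex_iterator       = typename std::vector<VV>::iterator;
    using const_vertex_iterator = typename std::vector<VV>::const_iterator;

    explicit tiny_graph(std::size_t n) : values_(n) {}

    // Non-const access: callers may modify vertex values.
    vertex_iterator vertex_begin() { return values_.begin(); }
    vertex_iterator vertex_end() { return values_.end(); }
    VV& vertex_value(std::size_t u) {
        assert(u < values_.size());
        return values_[u];
    }

    // Const access: the same operations on a const graph, read-only.
    const_vertex_iterator vertex_begin() const { return values_.begin(); }
    const_vertex_iterator vertex_end() const { return values_.end(); }
    const VV& vertex_value(std::size_t u) const {
        assert(u < values_.size());
        return values_[u];
    }

private:
    std::vector<VV> values_;
};

// Read-only algorithm: relies on the const overloads above.
template <class VV>
VV sum_values(const tiny_graph<VV>& g) {
    VV total{};
    for (auto it = g.vertex_begin(); it != g.vertex_end(); ++it) total += *it;
    return total;
}
```

Every accessor is written twice here, which is the boilerplate mentioned above.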

 

one interesting constructor takes a range of edges and a range of vertices; one function extracts the key from each edge in the edge range (just a pair of vertex keys), and another function extracts the property
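
A minimal sketch of such a constructor, assuming the key-extraction function returns a (source, target) pair of vertex keys. The names `edge_list_graph`, `ekey_fn` and `route` are hypothetical, and property extraction is left out to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical adjacency-list graph built from an arbitrary edge range.
class edge_list_graph {
public:
    // ekey_fn maps one element of the edge range to a
    // (source, target) pair of vertex keys.
    template <class EdgeRange, class EKeyFn>
    edge_list_graph(std::size_t vertex_count, const EdgeRange& edges, EKeyFn ekey_fn)
        : out_(vertex_count) {
        for (const auto& e : edges) {
            const std::pair<std::size_t, std::size_t> uv = ekey_fn(e);
            assert(uv.first < vertex_count && uv.second < vertex_count);
            out_[uv.first].push_back(uv.second);
        }
    }

    std::size_t out_degree(std::size_t u) const {
        assert(u < out_.size());
        return out_[u].size();
    }

private:
    std::vector<std::vector<std::size_t>> out_;
};

// Example user edge type; only the key function knows its layout.
struct route {
    std::size_t from;
    std::size_t to;
    double km;
};
```

The graph never needs to know the layout of `route`; a caller passes a lambda such as `[](const route& r) { return std::make_pair(r.from, r.to); }`.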

now I see I also need a way to specify the graph

 

need to revisit to see if I need to reimplement this for the test

 

then there is undirected adjacency list

 

assert there is one object per edge and is part of 2 linked lists

edges are in doubly linked list, stored in a vector,  after construction can't add vertexes  or edges

inedges are ordered by vertex key

 

everything else is similar

 

 

finally a section to adapt to external graph, adapt algo here to their data structures

we can define our own graph type to override the graph type to do the right thing, but also do that with types as well

I tested my own BFS algo with BGL17 data structure; yes have small things in there that can allow that to happen starting with a graph so I think this is the right approach

 

Jesun asked: Is it a hard requirement that the vertices will be in a vector and edges in a linked list for the undirected graph (and probably for the directed graph type)? Any implication on iterating over them in terms of performance as well as mutability of the graph?

Good question, this is something I'd like to explore; have a key that I want to access in a different way to enable conditional algo, where you don't have that requirement

I'd like to relax that area more, so I have not changed any concepts from what we had before

concept used reflects that algorithm; should still have something less restrictive; yes, probably right, should be able to do iteration on neighbors, randomly access a container of forward iterable containers

 

 

Walking through code TDD style

as I am doing development, it is outputting results

at top of file I set the Test option using the German routes

 

 

I like what you did with BGL17 testing  ... yes it was with tuples of ranges

 

don't have topological sort done yet

 

interface is stable

strongly connected components not tested

 

what about initializers? I have a few classes that will generate a graph for me but it can be improved as it is not repeatable; yes we have file i/o for matrix, graph =(){} is a convenient thing for testing

could I do it with what I have now or do I need a constructor? will look at what BGL17 has too

should be doable but have to go back to see where we put that in things to do:

1. fns I have not implemented

2. BGL17 compressed graph

3. range support sentinels

4. reverse filter for a graph?

creating a NN with weights is a common thing and eliminates an edge; yes that kind of filter is useful for NN
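
One way to picture the kind of edge filter discussed here is a predicate applied during iteration, so that pruned edges are skipped without modifying the graph. The names `weighted_edge`, `for_each_edge_if` and `count_active_edges` are hypothetical, and the near-zero-weight rule is only an assumed example of a filter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct weighted_edge {
    std::size_t source;
    std::size_t target;
    double weight;
};

// Visit only the edges that satisfy keep(edge); the edge list is untouched.
template <class Pred, class Visit>
void for_each_edge_if(const std::vector<weighted_edge>& edges, Pred keep, Visit visit) {
    for (const weighted_edge& e : edges)
        if (keep(e)) visit(e);
}

// Example filter: treat near-zero weights as eliminated connections.
inline std::size_t count_active_edges(const std::vector<weighted_edge>& edges,
                                      double threshold) {
    assert(threshold >= 0.0);
    std::size_t n = 0;
    for_each_edge_if(
        edges,
        [threshold](const weighted_edge& e) { return std::fabs(e.weight) > threshold; },
        [&n](const weighted_edge&) { ++n; });
    return n;
}
```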

comparison with other libraries  - put out a separate paper

can we store a graph in some constexpr arrays

marshalling,

relax constraints on algos to make them more flexible
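
On the question of storing a graph in constexpr arrays, one possible shape is a compressed sparse row layout held in two std::array members. This is only a sketch; `static_graph` and `triangle` are hypothetical names, and nothing here reflects a decision from the call.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size graph in compressed sparse row form.
// The edges of vertex u are targets[offsets[u]] .. targets[offsets[u + 1] - 1].
template <std::size_t V, std::size_t E>
struct static_graph {
    std::array<std::size_t, V + 1> offsets;
    std::array<std::size_t, E> targets;

    constexpr std::size_t out_degree(std::size_t u) const {
        return offsets[u + 1] - offsets[u];
    }

    constexpr bool has_edge(std::size_t u, std::size_t v) const {
        for (std::size_t i = offsets[u]; i < offsets[u + 1]; ++i)
            if (targets[i] == v) return true;
        return false;
    }
};

// A 3-vertex graph with edges 0->1, 0->2, 1->2, usable at compile time.
constexpr static_graph<3, 3> triangle{{{0, 2, 3, 3}}, {{1, 2, 2}}};

static_assert(triangle.out_degree(0) == 2, "vertex 0 has two out-edges");
static_assert(triangle.has_edge(1, 2), "edge 1->2 exists");
static_assert(!triangle.has_edge(2, 0), "vertex 2 has no out-edges");
```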

 

< my power died at this point>

 

 

aiming for guidance on moving this paper forward.

Richard Dosselmann et al

P1708R1: Math proposal for Machine Learning
https://docs.google.com/document/d/1VAgcyvL1riMdGz7tQIT9eTtSSfV3CoCEMWKk8GvVuFY/edit

> std.org/jtc1/sc22/wg21/docs/papers/2020/p1708r2
> above is the stats paper that was reviewed in Prague
> http://wiki.edg.com/bin/view/Wg21prague/P1708R2SG19
>
> Review Jolanta Polish feedback.
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html

 

 

Richard presents

large number of revisions; now replaces iterator pairs with ranges

free standing function presents linear pass over the data

for large data sets, then we have accumulator objects to make one combined pass and compute final stats at the end
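
The difference between the two styles can be sketched as follows: a free-standing function makes one pass per statistic, while an accumulator object is fed the data once and several statistics are read off at the end. The names `mean` and `summary_accumulator` are hypothetical and use std::vector<double> instead of ranges to keep the sketch short; they are not the interface of the paper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Free-standing function: one linear pass over the data for one statistic.
inline double mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Hypothetical accumulator object: values are fed in once, and several
// statistics are read off at the end of that single combined pass.
class summary_accumulator {
public:
    void operator()(double x) {
        ++count_;
        sum_ += x;
        min_ = std::min(min_, x);
        max_ = std::max(max_, x);
    }

    std::size_t count() const { return count_; }
    double mean() const {
        assert(count_ > 0);
        return sum_ / static_cast<double>(count_);
    }
    double min() const {
        assert(count_ > 0);
        return min_;
    }
    double max() const {
        assert(count_ > 0);
        return max_;
    }

private:
    std::size_t count_ = 0;
    double sum_ = 0.0;
    double min_ = std::numeric_limits<double>::infinity();
    double max_ = -std::numeric_limits<double>::infinity();
};
```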

alternate predicate, if you want to retrieve one value out of array of structures

 

for each mean, have overloads

MC suggested we have a weighted mean

 

Added execution policy for parallelization

gives you 4 variations of each of the means

 

we also have geometric and harmonic means with 4 flavors each
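
For reference, the weighted, geometric and harmonic means mentioned above can be sketched as below. The function names and the use of std::vector<double> are assumptions for illustration, and the execution-policy overloads are omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i).
inline double weighted_mean(const std::vector<double>& xs,
                            const std::vector<double>& ws) {
    assert(!xs.empty() && xs.size() == ws.size());
    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        num += ws[i] * xs[i];
        den += ws[i];
    }
    assert(den != 0.0);
    return num / den;
}

// Geometric mean via logarithms: exp(mean(log x_i)); requires x_i > 0.
inline double geometric_mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    double log_sum = 0.0;
    for (double x : xs) {
        assert(x > 0.0);
        log_sum += std::log(x);
    }
    return std::exp(log_sum / static_cast<double>(xs.size()));
}

// Harmonic mean: n / sum(1 / x_i); requires x_i != 0.
inline double harmonic_mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    double inv_sum = 0.0;
    for (double x : xs) {
        assert(x != 0.0);
        inv_sum += 1.0 / x;
    }
    return static_cast<double>(xs.size()) / inv_sum;
}
```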

 

variance also follows but makes clear working with population vs sample
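
The population versus sample distinction comes down to the divisor: n for a population, n - 1 for a sample. A minimal two-pass sketch, with the hypothetical names `variance_kind` and `variance`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class variance_kind { population, sample };

// Two-pass variance: first the mean, then the sum of squared deviations.
// Population variance divides by n, sample variance by n - 1.
inline double variance(const std::vector<double>& xs, variance_kind kind) {
    const std::size_t n = xs.size();
    assert(kind == variance_kind::population ? n >= 1 : n >= 2);

    double mean = 0.0;
    for (double x : xs) mean += x;
    mean /= static_cast<double>(n);

    double squares = 0.0;
    for (double x : xs) squares += (x - mean) * (x - mean);

    const std::size_t divisor = (kind == variance_kind::population) ? n : n - 1;
    return squares / static_cast<double>(divisor);
}
```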

 

passing 2 ranges - 1 from value and one from range

can you pass just one  (zipping of 2 ranges together and extracting the projection) if the 2 coincide with each other

OK, I might move in that direction, will think about it

 

replace median with general quantile

 

mode has a comparator for equality

 

python only returns 1st mode, but I will return all the modes
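
A sketch of a mode function that returns all modes and needs only an equality comparator, so it also works for non-numerical data. The name `modes` and the simple O(n * distinct) counting are assumptions for illustration, not the paper's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Returns every value that occurs with the highest frequency,
// in order of first appearance. Only equality is required of T.
template <class T, class Equal = std::equal_to<T>>
std::vector<T> modes(const std::vector<T>& xs, Equal eq = Equal{}) {
    std::vector<std::pair<T, std::size_t>> counts;  // distinct value, occurrences
    for (const T& x : xs) {
        bool found = false;
        for (auto& c : counts) {
            if (eq(c.first, x)) {
                ++c.second;
                found = true;
                break;
            }
        }
        if (!found) counts.emplace_back(x, 1);
    }

    std::size_t best = 0;
    for (const auto& c : counts)
        if (c.second > best) best = c.second;

    std::vector<T> result;
    for (const auto& c : counts)
        if (c.second == best) result.push_back(c.first);

    assert(result.empty() == xs.empty());  // no data, no modes
    return result;
}
```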

 

makes one linear pass through each data structure,

but can also allow single pass  to compute it all

using accumulated weights which also have weighted and unweighted version of each of the mean median and mode

this allows one single linear pass for all these data structures

 

mode can return a series of values and can handle non-numerical data

 

moving to documentation now

 

for normal distribution, can you have a parameter that defaults to normal? for a statistician, Poisson distribution, arrival time,

whether a mean is a good moment to calculate, sample mean are good estimators,

continue this on reflector

Differentiable Programming by Marco Foco

P1416R1: SG19 - Linear Algebra for Data Science and Machine Learning
https://docs.google.com/document/d/1IKUNiUhBgRURW-UkspK7fAAyIhfXuMxjk7xKikK4Yp8/edit#heading=h.tj9hitg7dbtr

P1415: Machine Learning Layered list
https://docs.google.com/document/d/1elNFdIXWoetbxjO1OKol_Wj8fyi4Z4hogfj5tLVSj64/edit#heading=h.tj9hitg7dbtr

2.2.2 SG14 Linear Algebra progress:
Different layers of proposal
https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit

2.2.3 any other proposal for reviews?

2.3 Other Papers and proposals

2.5 Future F2F meetings:

2.6 future C++ Standard meetings:
https://isocpp.org/std/meetings-and-participation/upcoming-meetings

- 2020-02-10 to 15: Prague, Czech Republic
- 2020-06-01 to 06: Bulgaria
- 2020-11: (New York, tentative)
- 2021-02-22 to 27: Kona, HI, USA

3. Any other business

New reflector

http://lists.isocpp.org/mailman/listinfo.cgi/sg19

Old Reflector
https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19
<https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!forum/sg14>

Code and proposal Staging area

4. Review

4.1 Review and approve resolutions and issues [e.g., changes to SG's
working draft]

4.2 Review action items (5 min)

5. Closing process

5.1 Establish next agenda

TBD

5.2 Future meeting

    Jul 9, 2020 02:00 PM
    Aug 13, 2020 02:00 PM
    Sep 10, 2020 02:00 PM
    Oct 8, 2020 02:00 PM