sg19: Re: [SG19] May 9 (Thursday tomorrow) SG19 Zoom Meeting

From: Michael Wong <fraggamuffin_at_[hidden]>
Date: Thu, 9 May 2019 14:54:04 -0400

On Wed, May 8, 2019 at 10:19 AM Michael Wong <fraggamuffin_at_[hidden]> wrote:

> SG19 Machine Learning 2 hours
> Topic: ISOCPP SG19 Machine Learning
> Time: May 9, 2019 1:00 PM Eastern Time (US and Canada)
> Every month on the Second Thu, until Sep 12, 2019, 5 occurrence(s)
> May 9, 2019 1:00 PM
> Jun 13, 2019 1:00 PM
> Jul 11, 2019 1:00 PM
> Aug 8, 2019 1:00 PM
> Sep 12, 2019 1:00 PM
> Please download and import the following iCalendar (.ics) files to
> your calendar system.
> Monthly:
> https://iso.zoom.us/meeting/405838761/ics?icsToken=fe375cbeab0268c23180f074fc6795e459a2bcd0fd6bdf9a8e146fa2acd22ef0
>
> Join from PC, Mac, Linux, iOS or Android: https://iso.zoom.us/j/405838761
>
> Or iPhone one-tap :
> US: +14086380968,,405838761# or +16468769923,,405838761#
> Or Telephone:
> Dial(for higher quality, dial a number based on your current
> location):
> US: +1 408 638 0968 or +1 646 876 9923 or +1 669 900 6833 or
> 877 853 5247 (Toll Free) or 877 369 0926 (Toll Free)
> Meeting ID: 405 838 761
> International numbers available: https://zoom.us/u/abhaIjFKLZ
>
> Or Skype for Business (Lync):
> https://iso.zoom.us/skype/405838761
>
> Agenda:
>
> 1. Opening and introductions
>
1.1 Roll call of participants
>
Frank Seide, Phil Ratzloff, David Gilles, Kirsten Lee, Marco Foco, Richard
Dosselmann, Ronan Keryell, Sebastian Messmer, Michael Wong, Vincent Reverdy

>
> 1.2 Adopt agenda
>
Approve

>
> 1.3 Approve minutes from previous meeting, and approve publishing
> previously approved minutes to ISOCPP.org
>
Approve.

> 1.4 Action items from previous meetings
>
> 2. Main issues (125 min)
>
> 2.1 General logistics
> All C++ reflectors have now moved to listserv
>
> https://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
> 2.2 Paper reviews
>
> Any papers proposed for review at COLOGNE? Deadline June 17
>
> 2.2.1: ML topics
> Differentiable Programming by Marco Foco
>
1. Wikipedia defines differentiable programming
allows automatic differentiation
used in deep learning
it's reborn
2. problem is black-box evaluation on data
feedback goes through the black box back to the generator
now optimize
synthesis and rendering need to be done through back propagation
in general cannot back propagate through the rendering

cannot take rendering back to differentiable synthesis
3. solution 1, numeric differentiation pros and cons
exponential complexity on 2nd and 3rd order
4. solution 2 is autodiff libraries
good for 2nd order
need template black magic so heavy compile time
5. solution 3 is differentiable programming
use a language primitive
more precise, use LLVM-based compilers doing SSA-level differentiation
6. started in 1968 Maxima, graph-based
PyTorch, TensorFlow
VLAD
Swift, Julia, Zygote, Halide
added to Swift to say I want this function to be differentiated, will
generate optimized back prop, or can also be generated by hand
separates the kernel from the scheduling
similar in Torch, takes pyCode and converts it to SSA
Frank: this is going to make its way into all important languages soon
Swift's approach uses a differentiable annotation; if you don't provide
the impl yourself, it can generate one after checking that your fn satisfies
all the constraints, or you can supply a custom one (Jacobian)
providing your own impl only good for low level things,
power comes from building a whole NN and just ask for loss function
should be an option for when you can't provide the back prop yourself like
on GPUs, you are not likely to have the back prop
What about data sharing, with complicated fn, gradient use intermediate
results, or fn with multiple argument that you want to differentiate
Not just sharing code, but also storing intermediate results somewhere.
Agree, maybe have a closure
need to use a keyword, not likely with a generalized attribute
Herb's metacommand proposal could also do it
As a primitive (differentiate this function) it is also something that
modifies the AST, but we can't standardize that
Reflection is only for types, not yet for content of a function
Come from Users, only need to be light weight
Cannot be opaque
we can create a new data type with compile time templates, and can still
manage for differentiable programming, probably expression templates
Problem with expression templates is that they take a lot of time to compile
a lot of work for the implementers, but it looks like normal code
And you still need to write generic code, and give up typing.
automatic differentiation
All feel we should move forward
and start looking at each languages solution as prior art, and then propose
our C++ solution
Frank will look at what his colleague did with expression templates
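
To make the contrast between solution 1 (numeric differentiation) and
solution 2 (autodiff libraries) concrete, here is a minimal C++ sketch. It was
not shown at the meeting, and the names ad_sketch, Dual, f,
central_difference and derivative_at are all hypothetical. The central
difference only approximates the derivative and depends on a step size; the
dual number carries the derivative through overloaded operators, which is the
kind of library machinery the notes associate with heavy compile times.

```cpp
#include <cassert>
#include <cmath>

namespace ad_sketch {

// Hypothetical dual number: a value together with its derivative.
struct Dual {
    double val;  // f(x)
    double der;  // f'(x)
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }

// Product rule: (ab)' = a'b + ab'.
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Chain rule for sine.
inline Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Generic code written once; works for double and for Dual.
template <class T>
T f(T x) {
    using std::sin;  // picks std::sin for double, ad_sketch::sin for Dual
    return x * x + sin(x);
}

// Solution 1: numeric differentiation by central difference (approximate).
template <class F>
double central_difference(F fn, double x, double h = 1e-5) {
    assert(h > 0.0);
    return (fn(x + h) - fn(x - h)) / (2.0 * h);
}

// Solution 2: forward-mode automatic differentiation by seeding der = 1.
inline double derivative_at(double x) { return f(Dual{x, 1.0}).der; }

}  // namespace ad_sketch
```

Calling ad_sketch::derivative_at(1.0) evaluates 2x + cos(x) at x = 1 without
choosing a step size, whereas central_difference needs a lambda wrapping f and
a suitable h.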

> Richard Dosselmann
> Math proposal for Machine Learning
>
> https://docs.google.com/document/d/1VAgcyvL1riMdGz7tQIT9eTtSSfV3CoCEMWKk8GvVuFY/edit
>

Math proposal by Richard Dosselmann
address major items needed for ML
basic statistics
also good for general programming
already in Boost.Accumulators
python has them
use basic iterators linked list and graph data structure
what do you require from the value type of the iterator? any that is
integrable, no strings
will this work with integers? the result depends on a fractional part, so it
would probably truncate and give an integer result
in matrix lib, what happens when we have eigen comes out
T would be iterator of int, but I want float
also can add default value to that template
will need to retweak it for Concepts and ranges, maybe in a backup session

Median can be implemented as constexpr, and mode can be applied to strings

allow overloads so that users can supply non-standard classes and
predicates? yes, the accumulate fn has that

get mean and standard deviation from random numbers for probability
distributions

just adding a few member functions that would return those,
not hard to compute
are there distributions that don't have these values? yes, like Cauchy
What is the plan for them? If we have a Distribution concept in future?
All feel this is OK to forward
do we feel we need a more generic version of that?
why not have kurtosis, min, max,
if we add basic 1st 2nd moment
why not the rest, do we have justified motivation
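
One way to picture the idea, including the Cauchy case, is a set of
hypothetical free functions over the existing <random> distributions that
return an empty optional when the moments are undefined. The names
dist_sketch, Moments and moments are assumptions; the notes talk about member
functions, which would look similar.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <random>

namespace dist_sketch {

struct Moments {
    double mean;
    double stddev;
};

// Normal: the parameters already are the mean and standard deviation.
inline std::optional<Moments> moments(const std::normal_distribution<double>& d) {
    return Moments{d.mean(), d.stddev()};
}

// Uniform on [a, b): mean (a + b) / 2, standard deviation (b - a) / sqrt(12).
inline std::optional<Moments> moments(const std::uniform_real_distribution<double>& d) {
    const double a = d.a();
    const double b = d.b();
    assert(b >= a);
    return Moments{(a + b) / 2.0, (b - a) / std::sqrt(12.0)};
}

// Exponential with rate lambda: mean and standard deviation are both 1 / lambda.
inline std::optional<Moments> moments(const std::exponential_distribution<double>& d) {
    return Moments{1.0 / d.lambda(), 1.0 / d.lambda()};
}

// Cauchy: neither the mean nor the variance is defined.
inline std::optional<Moments> moments(const std::cauchy_distribution<double>&) {
    return std::nullopt;
}

}  // namespace dist_sketch
```

Returning std::optional is only one possible answer to "what is the plan for
them"; a future Distribution concept could instead leave the functions out
for such distributions.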

>
>
> Graph Proposal for Machine Learning
>
> https://docs.google.com/document/d/13rdk1Xq8ZshUiTL5QASK1N2yD5bLwK3lQjbDs5yIF6o/edit
>
> Adjacency list, property graph, forward, bidirectional, and dynamic graph,
multiple edges
used boost graph 7 years ago, and borrowed some of the concepts like graph
traits
this breaks the compile time cyclic dependency
vertex, edge, graph type and the collection
graph_traits is the type passed to all of them

most graph packages use an integer id as the identifier for a vertex; now you
can define a pair vertex value type as the core that represents a vertex
can use a variety of containers to store the vertices

how flexible is this for others like adjacency arrays, where all your edges
are in one array?
vertices can be stored in a contiguous array, but there is not yet a mechanism
for edges
This might be worth adding
Can you build a graph over pointers, like a wrapper, where instead of the
graph holding the edges, it just holds references?
does an adjacency array require that all edges are sorted initially? Not
required, just different tradeoffs
more expensive if you have to insert into middle of an array
lists and arrays have same interface but different guarantees
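
For reference, an adjacency array in the sense discussed here can be sketched
as one contiguous array of edge targets plus an offsets array. This is an
illustration under assumptions (namespace csr_sketch, class adjacency_array
and its members are hypothetical), not part of the proposal. The constructor
accepts edges in any order and groups them by source, which matches the point
that sorted input is not required.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace csr_sketch {

class adjacency_array {
public:
    using edge_list = std::vector<std::pair<std::size_t, std::size_t>>;

    // Builds the layout from (source, target) pairs given in any order.
    adjacency_array(std::size_t vertex_count, const edge_list& edges)
        : offsets_(vertex_count + 1, 0), targets_(edges.size()) {
        // Count the out-degree of every source vertex.
        for (const auto& e : edges) {
            assert(e.first < vertex_count && e.second < vertex_count);
            ++offsets_[e.first + 1];
        }
        // Prefix sum turns the counts into start offsets.
        for (std::size_t v = 0; v < vertex_count; ++v) offsets_[v + 1] += offsets_[v];
        // Scatter every target into the slot range owned by its source.
        std::vector<std::size_t> next(offsets_.begin(), offsets_.end() - 1);
        for (const auto& e : edges) targets_[next[e.first]++] = e.second;
    }

    // Out-edges of v are the contiguous range [begin(v), end(v)).
    const std::size_t* begin(std::size_t v) const { return targets_.data() + offsets_[v]; }
    const std::size_t* end(std::size_t v) const { return targets_.data() + offsets_[v + 1]; }
    std::size_t out_degree(std::size_t v) const { return offsets_[v + 1] - offsets_[v]; }

private:
    std::vector<std::size_t> offsets_;  // vertex_count + 1 entries
    std::vector<std::size_t> targets_;  // all edges in one array
};

}  // namespace csr_sketch
```

Inserting an edge later would mean shifting part of targets_ and updating
offsets_, which illustrates why insertion into the middle of an array is more
expensive than with lists.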

directed vs undirected graphs might be able to use adapters
data layout and code complexity go up as you add those

vertex_set type shows performance characteristics
based on the underlying collection used
Traversals need a vertex_id on the edge to get constant time lookup

is vertex_id something the user needs to know? No, just an implementation
detail

static impl using an array is always a question; don't want to box ourselves
in so that we have to create a new data structure
this is the adjacency array or matrix (these are 2 different things)

should we sketch a design, to see if we can get a similar interface for that?
for traversal, is there an iterator?
what about a visitor pattern, to traverse but also create a new graph?

Looking at Examples, Object API
can see how things are created as well as traversal
this example is just counting, but not transforming
want iterator that goes through all the edges
also want a back inserter to know what source node it came from
can you give me some examples

Algorithm Class Design
Bellman-Ford shortest path
passes weigh1
can define your own, lambda and that will retrieve a value for you

for the kind that traverse and transform, you will need a few more arguments
No way to enforce in C++ yet, because you cannot extract the type of a lambda
through decltype
you can do that in C++20, which allows a lambda in an unevaluated context

In the Algorithm class, edge_weight_fnc should be a user function type, else
std::function may have extra costs
I would use my own type, and not a std::function

Should fn parameter pass by value or ref? Good question, not sure

default argument can be std fn
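
The weight-function discussion can be sketched with a Bellman-Ford whose
callable is a template parameter, so a lambda is called directly instead of
through std::function. This is a sketch under assumptions, not the proposal's
Algorithm class: bf_sketch, edge, default_edge_weight and bellman_ford are
hypothetical names, the default here is a plain function object where the
notes mention std::function, and the callable is taken by value as standard
algorithms do, which the notes leave open.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

namespace bf_sketch {

struct edge {
    std::size_t source;
    std::size_t target;
    double weight;
};

// Default weight function: read the stored weight.
struct default_edge_weight {
    double operator()(const edge& e) const { return e.weight; }
};

// Returns shortest distances from source, or nullopt if a negative cycle
// is reachable from source. Unreachable vertices keep distance infinity.
template <class WeightFn = default_edge_weight>
std::optional<std::vector<double>> bellman_ford(std::size_t vertex_count,
                                                const std::vector<edge>& edges,
                                                std::size_t source,
                                                WeightFn weight = WeightFn{}) {
    assert(source < vertex_count);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(vertex_count, inf);
    dist[source] = 0.0;

    // Relax every edge up to vertex_count - 1 times.
    for (std::size_t pass = 0; pass + 1 < vertex_count; ++pass) {
        bool changed = false;
        for (const edge& e : edges) {
            if (dist[e.source] != inf && dist[e.source] + weight(e) < dist[e.target]) {
                dist[e.target] = dist[e.source] + weight(e);
                changed = true;
            }
        }
        if (!changed) break;
    }

    // Any further improvement means a negative cycle is reachable.
    for (const edge& e : edges) {
        if (dist[e.source] != inf && dist[e.source] + weight(e) < dist[e.target]) {
            return std::nullopt;
        }
    }
    return dist;
}

}  // namespace bf_sketch
```

Passing a lambda such as [](const bf_sketch::edge&) { return 1.0; } as the
fourth argument turns the same routine into a hop-count search without
touching the edge data.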

continue forward, but can it make it into standard
graphs have different performance tradeoffs, am I storing vertex data next
to storage array,
so it's hard to generalize

At Kona, Andrew Lumsdaine of the Boost Graph Library said there is a big
problem in BGL and a functional programming approach will help
the way different components interact, just know what not to do
as there are so many types of graphs and trees
for trees and graphs we should have independence of memory layout

defining the concepts is hard, where do you put the limit
but start with something to work with
aim for 90-95 % of problem space

please try to come up with some of those concepts
Also may be have Andrew Lumsdaine look at this
Look into adjacency array. concepts, expand edge list,

Possibly just do a tree first, as a graph is too complex? Surprisingly,
generic trees are far more difficult than graphs
because trees have additional properties which need a lot more optimization
graphs only have 4-5 or 6 items
Going down the Concept path, but not the API; trees can be used to constrain
the API, but start first with graphs, with trees in mind.

Vincent to start a paper on trees

D1416R1: SG19 - Linear Algebra for Data Science and Machine Learning
>
> https://docs.google.com/document/d/1IKUNiUhBgRURW-UkspK7fAAyIhfXuMxjk7xKikK4Yp8/edit#heading=h.tj9hitg7dbtr
>
> P1415: Machine Learning Layered list
>
> https://docs.google.com/document/d/1elNFdIXWoetbxjO1OKol_Wj8fyi4Z4hogfj5tLVSj64/edit#heading=h.tj9hitg7dbtr
>
> 2.2.2 SG14 Linear Algebra progress: Bob Steagall
> Different layers of proposal
>
> https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit
>
>
> 2.2.3 any other proposal for reviews?
>
> 2.3 Other Papers and proposals
>
>
> 2.5 Future F2F meetings:
>
> 2.6 future C++ Standard meetings:
> https://isocpp.org/std/meetings-and-participation/upcoming-meetings
>
>
> - *2019-07-15 to 20: Cologne, Germany; *Nicolai Josuttis
> - *2019-11-04 to 09: Belfast, Northern Ireland;* Archer Yates
>
> - 2020-02-10 to 15: Prague, Czech Republic
- 2020-06-01 to 06: Bulgaria
- 2020-11: (New York, tentative)
- 2021-02-22 to 27: Kona, HI, USA

> 3. Any other business
>
> New reflector
>
> http://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
> Old Reflector
> https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19
> <https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!forum/sg14>
>
> Code and proposal Staging area
>
> 4. Review
>
> 4.1 Review and approve resolutions and issues [e.g., changes to SG's
> working draft]
>
> 4.2 Review action items (5 min)
>
>
> 5. Closing process
>
>
> 5.1 Establish next agenda
>
> June 13
>
>
> 5.2 Future meeting
> April 11 1-3 ET: Graph design
> May 9
> Jun 13: June 17 Mailing deadline
> Jul 11 - cancelled? C++ Standard Meeting Cologne
> Aug 8
> Sep 12
> Oct 10
> Nov 14 - cancelled due to DST change and switching to a new cycle.
>

Received on 2019-05-09 13:55:59