On Wed, May 8, 2019 at 10:19 AM Michael Wong <fraggamuffin@gmail.com> wrote:

SG19 Machine Learning 2 hours

Topic: ISOCPP SG19 Machine Learning
Time: May 9, 2019 1:00 PM Eastern Time (US and Canada)
    Every month on the Second Thu, until Sep 12, 2019, 5 occurrence(s)
    May 9, 2019 1:00 PM
    Jun 13, 2019 1:00 PM
    Jul 11, 2019 1:00 PM
    Aug 8, 2019 1:00 PM
    Sep 12, 2019 1:00 PM
    Please download and import the following iCalendar (.ics) files to your calendar system.
    Monthly: https://iso.zoom.us/meeting/405838761/ics?icsToken=fe375cbeab0268c23180f074fc6795e459a2bcd0fd6bdf9a8e146fa2acd22ef0

Join from PC, Mac, Linux, iOS or Android: https://iso.zoom.us/j/405838761

Or iPhone one-tap :
    US: +14086380968,,405838761#  or +16468769923,,405838761#
Or Telephone:
    Dial(for higher quality, dial a number based on your current location):
        US: +1 408 638 0968  or +1 646 876 9923  or +1 669 900 6833  or 877 853 5247 (Toll Free) or 877 369 0926 (Toll Free)
    Meeting ID: 405 838 761
    International numbers available: https://zoom.us/u/abhaIjFKLZ

Or Skype for Business (Lync):
    https://iso.zoom.us/skype/405838761


Agenda:

1. Opening and introductions 

1.1 Roll call of participants

Frank Seide, Phil Ratzloff, David Gilles, Kirsten Lee, Marco Foco, Richard Dosselmann, Ronan Keryell, Sebastian Messmer, Michael Wong, Vincent Reverdy


1.2 Adopt agenda

Approved.


1.3 Approve minutes from previous meeting, and approve publishing previously approved minutes to ISOCPP.org

Approved.

1.4 Action items from previous meetings

2. Main issues (125 min)

2.1 General logistics

All C++ reflectors have now moved to listserv

https://lists.isocpp.org/mailman/listinfo.cgi/sg19

2.2 Paper reviews


Any papers proposed for review at Cologne? Mailing deadline is June 17

2.2.1: ML topics
Differentiable Programming by Marco Foco
1. Wikipedia defines differentiable programming
allows automatic differentiation
used in deep learning
it's reborn
2. problem is black-box evaluation on data
feedback goes through the black box back to the generator
now optimize
synthesis and rendering need to be done through back propagation
in general cannot back propagate through the rendering

cannot take rendering back to differentiable synthesis
3. solution 1 is numeric differentiation, pros and cons
exponential complexity on 2nd and 3rd order
4. solution 2 is autodiff libraries
good for 2nd order
need template black magic, so heavy compile time
5. solution 3 is differentiable programming
use a language primitive
more precise, uses LLVM-based compilers doing SSA-level differentiation
6. started in 1968 with Maxima, graph-based
PyTorch, TensorFlow
VLAD
Swift, Julia, Zygote, Halide
added to Swift to say "I want this function to be differentiated"; will generate optimized back prop, or it can also be written by hand
separates the kernel from the scheduling
similar in Torch, takes Python code and converts it to SSA
Frank: this is going to make its way into all important languages soon
Swift approach is a differentiable annotation; if you don't provide the implementation yourself, it can generate one after checking that your function meets all the constraints, or you can supply a custom one (Jacobian)
providing your own implementation is only good for low-level things
power comes from building a whole NN and just asking for the loss function
should be an option for when you can't provide the back prop yourself; on GPUs, for example, you are not likely to have the back prop
What about data sharing? With a complicated function, the gradient uses intermediate results; or a function with multiple arguments that you want to differentiate
Not just sharing code, but also storing intermediate results somewhere. Agreed, maybe have a closure
need to use a keyword, not likely with a generalized attribute
Herb's metacommand proposal could also do it
As a primitive (differentiate this function), it is also something that modifies the AST, but we can't standardize that
Reflection is only for types, not yet for the content of a function
Comes from users, only needs to be lightweight
Cannot be opaque
we can create a new data type with compile-time templates and still manage differentiable programming, probably with expression templates
Problem with expression templates is that they take a lot of time to compile
a lot of work for the implementers, but it looks like normal code
And you still need to write generic code, and give up typing.
automatic differentiation
All feel we should move forward
and start looking at each language's solution as prior art, and then propose our C++ solution
Frank will look at what his colleague did with expression templates
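
A minimal sketch contrasting solution 1 (numeric differentiation) and solution 2 (an autodiff library) from the discussion above. Every name here (numeric_derivative, Dual, sample_fn, autodiff_derivative) is hypothetical and chosen for illustration only; none of it comes from the presentation. The autodiff half uses forward-mode dual numbers with operator overloading, which is only one of several possible library techniques.

```cpp
#include <cmath>

// Solution 1: central finite difference. The result is approximate and
// its accuracy depends on the step size h.
template <typename F>
double numeric_derivative(F f, double x, double h = 1e-5) {
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Solution 2: forward-mode autodiff. A Dual carries a value together
// with the derivative of that value with respect to the input.
struct Dual {
    double value;
    double deriv;
};

inline Dual operator+(Dual a, Dual b) {
    return {a.value + b.value, a.deriv + b.deriv};
}

inline Dual operator*(Dual a, Dual b) {
    // Product rule.
    return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}

inline Dual sin(Dual a) {
    // Chain rule.
    return {std::sin(a.value), std::cos(a.value) * a.deriv};
}

// The function must be written generically so that it accepts both
// double and Dual; this is the "generic code" cost noted above.
template <typename T>
T sample_fn(T x) {
    using std::sin;  // double uses std::sin, Dual is found by ADL
    return x * x + sin(x);
}

// Seed the input derivative with 1 and read the derivative of the result.
template <typename F>
double autodiff_derivative(F fn, double x) {
    return fn(Dual{x, 1.0}).deriv;
}
```

Usage: with auto g = [](auto v) { return sample_fn(v); }, both numeric_derivative(g, 1.5) and autodiff_derivative(g, 1.5) estimate 2x + cos(x) at x = 1.5; the dual-number result is exact up to floating-point rounding, while the finite difference carries a truncation error that depends on h.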



Graph proposal by Richard Dosselmann
address major items needed for ML
basic statistics
also good for general programming
already in Boost accumulate
Python has them
use basic iterators, linked list and graph data structure
what do you require from the value type of the iterator? anything that is integrable, no strings
will this work with integers? depends on the fractional part, so probably truncate and get an integer result
in matrix lib, what happens when we have eigen comes out
T would be iterator of int, but I want float
also can add default value to that template
will need to retweak it for Concepts and ranges, maybe in a backup session

Median can be implemented for constexpr, and string applied to a mode

allow overloads to enable to not use standard classes and predicates? yes accumulate fn has that

get mean and standard deviation from random numbers for probability distributions

just adding a few member functions that would return those,
not hard to compute
are there distributions that don't have these values? yes, like Cauchy?
What is the plan for them? If we have a Distribution concept in future?
All feel this is OK to forward
do we feel we need a more generic version of that?
why not have kurtosis, min, max,
if we add basic 1st 2nd moment
why not the rest, do we have justified motivation
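
A sketch of how mean and standard deviation could be reported for the standard distributions, including the Cauchy case raised above. The discussion talks about adding member functions; since members cannot be added to std types from user code, this illustration uses hypothetical free functions named moments and a hypothetical Moments struct. Returning std::optional for distributions whose moments are undefined is one possible answer to "what is the plan for them", not a decision recorded in the meeting. Requires C++17.

```cpp
#include <cmath>
#include <optional>
#include <random>

// Hypothetical result type: first moment and square root of the
// second central moment.
struct Moments {
    double mean;
    double stddev;
};

inline std::optional<Moments> moments(
    const std::uniform_real_distribution<double>& d) {
    const double a = d.a();
    const double b = d.b();
    return Moments{(a + b) / 2.0, (b - a) / std::sqrt(12.0)};
}

inline std::optional<Moments> moments(
    const std::normal_distribution<double>& d) {
    // normal_distribution already exposes both parameters directly.
    return Moments{d.mean(), d.stddev()};
}

inline std::optional<Moments> moments(
    const std::exponential_distribution<double>& d) {
    return Moments{1.0 / d.lambda(), 1.0 / d.lambda()};
}

inline std::optional<Moments> moments(
    const std::cauchy_distribution<double>&) {
    // The Cauchy distribution has no defined mean or variance.
    return std::nullopt;
}
```

Usage: moments(std::normal_distribution<double>(1.0, 2.0)) holds {1.0, 2.0}, while moments(std::cauchy_distribution<double>(0.0, 1.0)) is empty.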

Adjacency list, property graph, forward, bidirectional, and dynamic graph, multiple edges
used Boost Graph 7 years ago, and borrowed some of the concepts like graph traits
this breaks the compile time cyclic dependency
vertex, edge, graph type and the collection
graph_traits is the type passed to all of them

most graph packages use an integer id as the identifier for a vertex; now can define a pair vertex value type as the core that represents a vertex
use variety of containers to store the vertex

how flexible is this for others like adjacency arrays, where all your edges are in one array?
vertices can be stored in a contiguous array, but there is not yet a mechanism for edges
This might be worth adding
Can you build a graph over pointers, like a wrapper, so that instead of the graph holding the edges it just holds references?
does the adjacency array define a requirement that all edges are sorted initially? Not required, just different tradeoffs
more expensive if you have to insert into the middle of an array
lists and arrays have the same interface but different guarantees
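
For reference, a sketch of the adjacency array layout mentioned above, in which all edges live in one contiguous array grouped by source vertex (often called compressed sparse row). The names csr_graph, make_csr, and out_degree are hypothetical. The input edge list does not have to be sorted; the builder groups it with a counting pass.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct csr_graph {
    // row_start[v] .. row_start[v + 1] is the range of v's out-edges
    // inside targets; row_start has vertex_count + 1 entries.
    std::vector<std::size_t> row_start;
    // Target vertex of every edge, all in one array.
    std::vector<std::size_t> targets;
};

// Builds the layout from (source, target) pairs in O(V + E).
inline csr_graph make_csr(
    std::size_t vertex_count,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    csr_graph g;
    g.row_start.assign(vertex_count + 1, 0);

    // Count out-edges per source, shifted by one slot.
    for (const auto& e : edges) {
        ++g.row_start[e.first + 1];
    }
    // Prefix sum turns counts into start offsets.
    for (std::size_t v = 0; v < vertex_count; ++v) {
        g.row_start[v + 1] += g.row_start[v];
    }

    // Scatter each edge into the next free slot of its source's range.
    g.targets.resize(edges.size());
    std::vector<std::size_t> next(g.row_start.begin(),
                                  g.row_start.end() - 1);
    for (const auto& e : edges) {
        g.targets[next[e.first]++] = e.second;
    }
    return g;
}

inline std::size_t out_degree(const csr_graph& g, std::size_t v) {
    return g.row_start[v + 1] - g.row_start[v];
}
```

The tradeoff noted above is visible here: traversal of a vertex's edges is a contiguous scan, but adding one edge later means shifting targets and updating every following row_start entry.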

directed vs undirected graphs might be able to use adapters
data layout and code complexity go up as you add those

vertex_set type shows performance characteristics
based on the underlying collection used
Traversals need a vertex_id on the edge to get constant time lookup

is vertex_id something the user needs to know? No, just an implementation detail

a static implementation using an array is always a question; don't want to box ourselves in so that we have to create a new data structure
this is the adjacency array or matrix (these are 2 different things)

should we sketch a design, to see if we can get a similar interface for that?
for traversal, is there an iterator?
what about a visitor pattern that traverses but also creates a new graph?

Looking at Examples, Object API
can see how things are created as well as traversal
this example is just counting, but not transforming
want iterator that goes through all the edges
also want a back inserter to know what source node it came from
can you give me some examples

Algorithm Class Design
Bellman-Ford shortest path
passes weight
can define your own lambda, and that will retrieve a value for you

for the kind that traverse and transform, you will need a few more arguments
No way to enforce in C++ yet, because cannot extract type of a lambda through decltype
you can do that in C++17, lambda in an unevaluated context

In the algorithm class, edge_weight_fnc should be a user function type, else std::function may have extra costs
I would use my own type, and not a std::function

Should fn parameter pass by value or ref? Good question, not sure

default argument can be std fn
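
A sketch of the edge-weight function point above, with the weight accessor as a template parameter so that a lambda is called directly rather than through std::function. The names weighted_edge and bellman_ford, the edge-list representation, and the std::optional return used to report a negative cycle are all assumptions made for this illustration and are not the algorithm class from the proposal. The function object is taken by value here, as the standard algorithms do; the by-value versus by-reference question above stays open. Requires C++17.

```cpp
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

struct weighted_edge {
    std::size_t source;
    std::size_t target;
    double weight;
};

// Returns the shortest distance from source to every vertex, or
// std::nullopt if a negative cycle is reachable from source.
template <typename WeightFn>
std::optional<std::vector<double>> bellman_ford(
    std::size_t vertex_count, const std::vector<weighted_edge>& edges,
    std::size_t source, WeightFn weight_of) {
    constexpr double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(vertex_count, inf);
    dist[source] = 0.0;

    // Relax every edge at most vertex_count - 1 times.
    for (std::size_t pass = 0; pass + 1 < vertex_count; ++pass) {
        bool changed = false;
        for (const auto& e : edges) {
            const double candidate = dist[e.source] + weight_of(e);
            if (candidate < dist[e.target]) {
                dist[e.target] = candidate;
                changed = true;
            }
        }
        if (!changed) break;  // converged early
    }

    // Any further improvement means a negative cycle.
    for (const auto& e : edges) {
        if (dist[e.source] + weight_of(e) < dist[e.target]) {
            return std::nullopt;
        }
    }
    return dist;
}

// Convenience overload that reads the stored weight member.
inline std::optional<std::vector<double>> bellman_ford(
    std::size_t vertex_count, const std::vector<weighted_edge>& edges,
    std::size_t source) {
    return bellman_ford(vertex_count, edges, source,
                        [](const weighted_edge& e) { return e.weight; });
}
```

Usage: bellman_ford(4, edges, 0, [](const weighted_edge& e) { return e.weight * 2.0; }) runs the same algorithm with a user-defined weight, with no type erasure involved.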

continue forward, but can it make it into the standard?
graphs have different performance tradeoffs, e.g. am I storing vertex data next to the storage array?
so it's hard to generalize

At Kona, Andrew Lumsdaine of the Boost Graph Library said there is a big problem in BGL and a functional programming approach will help
the way different components interact, just know what not to do
as there are so many types of graphs and trees
for trees and graphs we should have independence of memory layout

defining the concepts is hard, where do you put the limit
but start with something to work with
aim for 90-95 % of problem space

please try to come up with some of those concepts
Also maybe have Andrew Lumsdaine look at this
Look into adjacency array, concepts, expand edge list


Possibly just do a tree first, as a graph is too complex? Surprisingly, generic trees are far more difficult than graphs
because trees have additional properties which need a lot more optimization
graphs only have 4-5 or 6 items
Going down the Concept path, but not the API; trees can be used to constrain the API, but start with graphs with trees in mind.

Vincent to start a paper on trees

D1416R1: SG19 - Linear Algebra for Data Science and Machine Learning

P1415: Machine Learning Layered list

2.2.2 SG14 Linear Algebra progress: Bob Steagall
Different layers of proposal


2.2.3 any other proposal for reviews?





2.3 Other Papers and proposals


2.5 Future F2F meetings:


2.6 future C++ Standard meetings:

https://isocpp.org/std/meetings-and-participation/upcoming-meetings

  • 2019-07-15 to 20: Cologne, Germany; Nicolai Josuttis
  • 2019-11-04 to 09: Belfast, Northern Ireland; Archer Yates
  • 2020-02-10 to 15: Prague, Czech Republic
  • 2020-06-01 to 06: Bulgaria
  • 2020-11: (New York, tentative)
  • 2021-02-22 to 27: Kona, HI, USA

3. Any other business

New reflector

http://lists.isocpp.org/mailman/listinfo.cgi/sg19

Old Reflector
https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19

Code and proposal Staging area

4. Review

4.1 Review and approve resolutions and issues [e.g., changes to SG's working draft]

4.2 Review action items (5 min)


5. Closing process


5.1 Establish next agenda

June 13


5.2 Future meeting

April 11 1-3 ET: Graph design
May 9
Jun 13: June 17 Mailing deadline
Jul 11 - cancelled? C++ Standard Meeting Cologne
Aug 8
Sep 12
Oct 10
Nov 14 - cancelled due to DST change and switching to a new cycle.