On Wed, May 11, 2022 at 9:55 PM Michael Wong <fraggamuffin@gmail.com> wrote:
Hi all, SG19 Machine Learning meeting will focus on stats.
Topic: SG19 monthly
Time: 02:00 PM Eastern Time (US and Canada) 
    Every month on the second Thursday

1. Opening and introductions

1.1 Roll call of participants
Michael Wong, Richard Dosselman, Ozan Irsoy, Phil Ratzloff, Luke D'Alessandro, Kevin Dewessee, Chris Ryan, Andrew Lumsdaine, Rene Rivera, Ka Ming Chan, Jens Maurer

1.2 Adopt agenda

1.3 Approve minutes from previous meeting, and approve publishing
 previously approved minutes to ISOCPP.org

1.4 Action items from previous meetings

2. Main issues (125 min)

2.1 General logistics

Meeting plan, focus on one paper per meeting but does not preclude other
paper updates:

    May 12, 2022 02:00 PM ET: Stats
    June 9, 2022 02:00 PM ET: Graph
    Jul 14, 2022 02:00 PM ET: Matrix, RL and DC
    Aug 11, 2022 02:00 PM ET: Stats
    Sep 13, 2022 02:00 PM ET: Graph
    Oct 12, 2022 02:00 PM ET: Matrix RL/DC

ISO meeting status

future C++ Std meetings

2.2 Paper reviews

2.2.1: ML topics Graph Proposal Phil Ratzloff et al

Latest paper:

Here’s a link to the paper (different than the previous paper reviewed).
Array copy semantics:
Array copy-semantics paper: P1997 "Relaxing Restrictions on Arrays"

Stats feedback:

on Simple Statistical Functions (p1708r4): Contracts, Exceptions and
Special cases, Johan Lundberg

Stats SG6:
Met with SG6 a month ago.
Can we fuse the mean accumulators into one? Construct it with no parameter and pass a parameter to it, merging the weighted and unweighted versions together.
But the weighted version of variance_accumulator does not have the freedom of choosing 1/(n-1), so it would need a different constructor; decided with SG6 that it is better to have separate weighted and unweighted variance_accumulator types.
Merge the weighted and unweighted kurtosis together? This seems more involved.
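
To make the accumulator discussion above concrete, here is a minimal sketch of one way it could look. It is not the P1708 wording: the class names, the `operator()` interface, the `sample` flag and the choice of Welford and West update formulas are all assumptions made for illustration. It reads "the freedom of 1/(n-1)" as the choice between dividing by n and by n-1, which is also an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical fused mean accumulator: default-constructed, with the weight
// supplied per observation (an omitted weight counts as 1).
class mean_accumulator {
    double weighted_sum_ = 0.0;  // running sum of w_i * x_i
    double weight_total_ = 0.0;  // running sum of w_i
public:
    void operator()(double x, double w = 1.0) {
        weighted_sum_ += w * x;
        weight_total_ += w;
    }
    double value() const {
        assert(weight_total_ > 0.0);  // assumed precondition
        return weighted_sum_ / weight_total_;
    }
};

// Hypothetical unweighted variance accumulator (Welford's update).
// The constructor flag selects 1/(n-1) (sample) or 1/n (population).
class variance_accumulator {
    std::size_t n_ = 0;
    double mean_ = 0.0;
    double m2_ = 0.0;  // sum of squared deviations from the running mean
    bool sample_;
public:
    explicit variance_accumulator(bool sample = true) : sample_(sample) {}
    void operator()(double x) {
        ++n_;
        const double delta = x - mean_;
        mean_ += delta / static_cast<double>(n_);
        m2_ += delta * (x - mean_);
    }
    double value() const {
        assert(n_ > (sample_ ? 1u : 0u));
        return m2_ / static_cast<double>(sample_ ? n_ - 1 : n_);
    }
};

// Hypothetical weighted variance accumulator (West's update). It divides by
// the total weight only, so it has no 1/(n-1) option and no such flag.
class weighted_variance_accumulator {
    double weight_total_ = 0.0;
    double mean_ = 0.0;
    double m2_ = 0.0;  // weighted sum of squared deviations
public:
    void operator()(double x, double w) {
        assert(w > 0.0);
        weight_total_ += w;
        const double delta = x - mean_;
        mean_ += (w / weight_total_) * delta;
        m2_ += w * delta * (x - mean_);
    }
    double value() const {
        assert(weight_total_ > 0.0);
        return m2_ / weight_total_;
    }
};
```

In this sketch the mean fuses cleanly because an unweighted observation is just a weight of 1, while the two variance types end up with different constructors, which is the asymmetry noted above.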

Clear to move on to LEWG.

Median, quantile and mode are different; let's have a look.

Median and mode could be multiple values,
 so a sorted range is needed.
Good feeling that the median and quantile functions will work; SG6 concerned.
We don't know how many modes there will be.
Mode can be messy to deal with.
Users want to bin data into little groups.
Looking at the Boost Histogram library.
Quantile in sorted order: want .25 (the element 25% of the way through). If the array has even length there would be 2 in the middle: then average, or return both?
User provides a sorted range and says which quantile they want; must also give us the size.
Median would be the quantile at 0.5.
Convenience function to get multiple quantiles: pass a range of them and do one linear scan.
Does the range already have the elements in there? There is a concept, sized_range, which means it has a customization point for size.
What is Q? The 60% quantile; unconstrained, so float or double?
Why a pair in the return value, and why optional? If the quantile falls between 2 elements then return 2, not just one.
std::pair should be the smallest struct.
Look at the ranges algorithms.
If you don't support ranges that don't meet that concept, then N is appropriate; sized_range could exclude some data types because you might not want to do a scan, so maybe indicate that through naming.
You always have to test the optional for the size, so is it simpler if you always have 2? Is that confusing if you get 5 twice?
I can't always ignore that optional.
Can you return a range instead of a pair?

Disagreement with the alternative function overload.

1 accumulator that brings back 4 results; this now has memory allocation concerns, but it is an output iterator
or 4 separate accumulators that each bring back 1
Use case: compute several things over the same range, and scan a large range only once.
If it is a random access range, then just jump to where the data is;
want an optimization for the random access case.
Unfriendly interface; do we expect a lot of data to be read only once?
A selection algorithm on unsorted data is O(N) instead of O(N log N).
Don't remove accumulators yet, but the sweet spot may be small.
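
As a small illustration of the selection point above, the sketch below uses `std::nth_element`, which is linear on average, to find a middle element of unsorted data without a full sort. The function name and the choice of the lower middle element for even sizes are assumptions for the example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: lower median of unsorted data by selection.
// The vector is taken by value because std::nth_element reorders it.
template <class T>
T lower_median_by_selection(std::vector<T> data)
{
    assert(!data.empty());
    const std::size_t mid = (data.size() - 1) / 2;
    const auto target = data.begin() + static_cast<std::ptrdiff_t>(mid);

    // After this call *target holds the element that a full sort would
    // place at index `mid`; the rest is only partitioned around it.
    std::nth_element(data.begin(), target, data.end());
    return *target;
}
```

This needs random access and a mutable copy of the data, so it does not replace a single-pass accumulator for data that can only be read once.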

Naming: indicate the sorted range in the name, not on the return; try various naming schemes: quantiles_of_sorted
How about a tag? Usually only for constructors.
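
The two options above can be compared side by side in a short sketch. All names here (`lower_median_of_sorted`, `sorted_range_t`, `sorted_range`, `lower_median`) are hypothetical; the tag follows the pattern of standard tags such as `std::sorted_unique`, which C++23 uses in the flat container constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Option A: the sortedness precondition is part of the function name.
inline double lower_median_of_sorted(const std::vector<double>& sorted)
{
    assert(!sorted.empty());
    return sorted[(sorted.size() - 1) / 2];
}

// Option B: a tag type states the precondition at the call site.
struct sorted_range_t { explicit sorted_range_t() = default; };
inline constexpr sorted_range_t sorted_range{};

inline double lower_median(sorted_range_t, const std::vector<double>& sorted)
{
    return lower_median_of_sorted(sorted);
}
```

A call then reads either `lower_median_of_sorted(v)` or `lower_median(sorted_range, v)`. The tag keeps one function name across sorted and unsorted overloads, while the suffix keeps the overload set simpler.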

The quantile function needs a constraint: it must be convertible from the value type of the range.
Template parameters have R first and sometimes last; can we be consistent? So they will go back.

Reinforcement Learning Larry Lewis Jorge Silva

Reinforcement Learning proposal: Differential Calculus:

Current github



Stats review Richard Dosselman et al


Feedback from Johan Lundberg and Oleksandr Korval


P1708R3: Math proposal for Machine Learning: 3rd review

PXXXX: combinatorics: 1st Review

> std.org/jtc1/sc22/wg21/docs/papers/2020/p1708r2
> above is the stats paper that was reviewed in Prague
> http://wiki.edg.com/bin/view/Wg21prague/P1708R2SG19
> Review Jolanta Polish feedback.
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html
Matrix paper

2.2.3 any other proposal for reviews?

2.3 Other Papers and proposals

P1416R1: SG19 - Linear Algebra for Data Science and Machine Learning

P1415: Machine Learning Layered list

2.2.2 SG14 Linear Algebra progress:
Different layers of proposal

2.5 Future F2F meetings:

2.6 future C++ Standard meetings:


3. Any other business

New reflector


Old Reflector

Code and proposal Staging area

4. Review

4.1 Review and approve resolutions and issues [e.g., changes to SG's
working draft]

4.2 Review action items (5 min)

5. Closing process

5.1 Establish next agenda

5.2 Future meeting

