ISOCPP sg19 List: Re: SG19 ML May 12 meeting

From: Michael Wong <fraggamuffin_at_[hidden]>
Date: Thu, 12 May 2022 15:04:44 -0400

On Wed, May 11, 2022 at 9:55 PM Michael Wong <fraggamuffin_at_[hidden]> wrote:

> Hi all, SG19 Machine Learning meeting will focus on stats.
> Michael Wong is inviting you to a scheduled Zoom meeting.
>
> Topic: SG19 monthly
> Time: 02:00 PM Eastern Time (US and Canada)
> Every month on the Second Thu,
>
>
> Join from PC, Mac, Linux, iOS or Android:
>
> https://iso.zoom.us/j/93084591725?pwd=K3QxZjJlcnljaE13ZWU5cTlLNkx0Zz09
> Password: 035530
>
> Or iPhone one-tap :
> US: +13017158592,,93084591725# or +13126266799,,93084591725#
> Or Telephone:
> Dial(for higher quality, dial a number based on your current location):
> US: +1 301 715 8592 or +1 312 626 6799 or +1 346 248 7799 or +1
> 408 638 0968 or +1 646 876 9923 or +1 669 900 6833 or +1 253 215 8782
> or 877 853 5247 (Toll Free)
> Meeting ID: 930 8459 1725
> Password: 035530
> International numbers available: https://iso.zoom.us/u/agewu4X97
>
> Or Skype for Business (Lync):
> https://iso.zoom.us/skype/93084591725
>
> Agenda:
>
> 1. Opening and introductions
>
> The ISO Code of conduct:
> https://www.iso.org/files/live/sites/isoorg/files/store/en/PUB100397.pdf
>
> IEC Code of Conduct:
>
> https://www.iec.ch/basecamp/iec-code-conduct-technical-work
>
> ISO patent policy.
>
>
> https://isotc.iso.org/livelink/livelink/fetch/2000/2122/3770791/Common_Policy.htm?nodeid=6344764&vernum=-2
>
> The WG21 Practices and Procedures and Code of Conduct:
>
> https://isocpp.org/std/standing-documents/sd-4-wg21-practices-and-procedures
>
> 1.1 Roll call of participants
>
Michael Wong, Richard Dosselman, Ozan Irsoy, Phil Ratzloff, Luke
D'Alessandro, Kevin Dewessee, Chris Ryan, Andrew Lumsdaine, Rene Rivera, Ka
Ming Chan, Jens Maurer

>
> 1.2 Adopt agenda
>
> 1.3 Approve minutes from previous meeting, and approve publishing
> previously approved minutes to ISOCPP.org
>
> 1.4 Action items from previous meetings
>
> 2. Main issues (125 min)
>
> 2.1 General logistics
>
> Meeting plan, focus on one paper per meeting but does not preclude other
> paper updates:
>
>
> May 12, 2022 02:00 PM ET: Stats
> June 9, 2022 02:00 PM ET: Graph
> Jul 14, 2022 02:00 PM ET: Matrix, RL and DC
> Aug 11, 2022 02:00 PM ET: Stats
> Sep 13, 2022 02:00 PM ET: Graph
> Oct 12, 2022 02:00 PM ET: Matrix RL/DC
>
> ISO meeting status
>
> future C++ Std meetings
>

- 2022-11-07 to 12: Kona, HI, USA
<https://isocpp.org/files/papers/N4912.pdf>: Standard C++ Foundation

>
> 2.2 Paper reviews
>
> 2.2.1: ML topics
>
> 2.2.1.1 Graph Proposal Phil Ratzloff et al
>
> Latest paper:
>
> Here’s a link to the paper (different than the previous paper reviewed).
> There are some additional updates I’m planning on making before the
> meeting.
>
>
> https://docs.google.com/document/d/1OpH-xxRri7tJTtJJIZTYmSHkkrZJkdBwm9zJ7LqolfQ/edit?usp=sharing
>
>
>
>
> P1709R3:
>
> https://docs.google.com/document/d/1kLHhbSTX7j0tPeTYECQFSNx3R35Mu3xO5_dyYdRy4dM/edit?usp=sharing
>
>
> https://docs.google.com/document/d/1QkfDzGyfNQKs86y053M0YHOLP6frzhTJqzg1Ug_vkkE/edit?usp=sharing
>
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html>
>
> https://docs.google.com/document/d/175wIm8o4BNGti0WLq8U6uZORegKVjmnpfc-_E8PoGS0/edit?ts=5fff27cd#heading=h.9ogkehmdmtel
>
> Array copy semantics:
> array copy-semantics paper P1997 "Relaxing Restrictions on Arrays",
> https://wg21.link/p1997
>
> Stats feedback:
>
> P2376R0
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2376r0.pdf>
> Comments
> on Simple Statistical Functions (p1708r4): Contracts, Exceptions and
> Special cases Johan Lundberg
>

Stats SG6:
Met with them a month ago.
Can we fuse the mean_accumulators into one? Construct it with no parameter
and pass a parameter to it, merging the weighted and unweighted versions
together.
But the weighted version of variance_accumulator does not have the freedom
of choosing 1/(n-1), so it would need a different constructor; decided with
SG6 that it is better to have separate weighted and unweighted
variance_accumulators.
Merge the weighted and unweighted kurtosis together? This seems more
involved.
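
The accumulator point above can be illustrated with a minimal sketch. All class names and interfaces here are hypothetical and are not taken from P1708; the sketch assumes the "parameter" passed to the fused mean accumulator is a per-sample weight defaulting to 1, and that weights are positive. It shows why the mean fuses easily while the variance does not: the unweighted variance has a divisor choice (n - 1 or n) that the weighted form, dividing by total weight, lacks.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names and interfaces, for illustration only; not from P1708.

// One mean accumulator covers both cases: an unweighted sample has weight 1.
class mean_accumulator {
public:
    void operator()(double x, double w = 1.0) {
        weighted_sum_ += w * x;
        weight_total_ += w;
    }
    double value() const {
        assert(weight_total_ > 0.0);
        return weighted_sum_ / weight_total_;
    }
private:
    double weighted_sum_ = 0.0;
    double weight_total_ = 0.0;
};

// Unweighted variance: the constructor picks the divisor, n - 1 or n.
class variance_accumulator {
public:
    explicit variance_accumulator(bool sample = true) : sample_(sample) {}
    void operator()(double x) {  // Welford's one-pass update
        n_ += 1.0;
        const double delta = x - mean_;
        mean_ += delta / n_;
        m2_ += delta * (x - mean_);
    }
    double value() const {
        assert(n_ > (sample_ ? 1.0 : 0.0));
        return m2_ / (sample_ ? n_ - 1.0 : n_);
    }
private:
    bool sample_;
    double n_ = 0.0, mean_ = 0.0, m2_ = 0.0;
};

// Weighted variance (positive weights assumed): divides by the total
// weight, so there is no divisor choice to expose in the constructor.
class weighted_variance_accumulator {
public:
    void operator()(double x, double w) {  // weighted form of Welford's update
        weight_total_ += w;
        const double delta = x - mean_;
        mean_ += (w / weight_total_) * delta;
        m2_ += w * delta * (x - mean_);
    }
    double value() const {
        assert(weight_total_ > 0.0);
        return m2_ / weight_total_;
    }
private:
    double weight_total_ = 0.0, mean_ = 0.0, m2_ = 0.0;
};
```

With all weights equal to 1, the weighted variance in this sketch matches the unweighted variance with divisor n, not n - 1, which is one way to see why the two constructors differ.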

clear to move on to LEWG

Median, quantile and mode are different; let's have a look.

Median and mode could be multiple values,
so a sorted range is needed.
Good feeling that the median and quantile fns will work; SG6 is concerned.
We don't know how many modes there will be;
mode can be messy to deal with.
Users want to bin data into little groups;
looking at the Boost histogram library.
Quantile in sorted order: want 0.25 (the element 25% of the way through,
percentage-wise). With an even-sized array there would be 2 in the middle;
then average them, or return both?
The user provides a sorted range, says which quantile is wanted, and must
also give us the size.
Median would be the quantile at 0.5.
Convenience function to get multiple quantiles: pass a range of them
and do one linear scan.
Does the range already have its element count in there? There is a concept,
sized_range, which means it has a customization point for size.
What is Q? E.g. the 60% quantile; it is unconstrained, so float or double?
Why a pair in the return value, and why optional? If the element is between
2, then return 2, not just one.
std::pair should be the smallest struct.
Look at the ranges algorithms.
If you don't support ranges that don't meet that concept, then N is
appropriate; sized_range could exclude some data types because you might
not want to do a scan, so maybe indicate that through naming.
You always have to test the optional for the size, so is it simpler if you
always just have 2? Is that confusing, e.g. if you have 5 twice?
I can't always ignore that optional.
Can you return a range instead of a pair?
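
A minimal sketch of the interface shape discussed above, written in C++17 style with an explicit size N instead of a sized_range constraint. The names quantile_of_sorted, median_of_sorted and quantile_result are hypothetical, and the position rule q * (n - 1) is just one of several common quantile conventions, chosen here as an assumption. The function returns the element at the quantile position, plus an optional second element when the position falls between two elements.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical result type: one element, or two when the quantile
// position falls between two neighbouring elements.
template <class T>
using quantile_result = std::pair<T, std::optional<T>>;

// Hypothetical sketch: the caller supplies a sorted range (as an iterator),
// its size n, and the quantile q in [0, 1].
template <class It>
auto quantile_of_sorted(It first, std::size_t n, double q)
{
    using T = typename std::iterator_traits<It>::value_type;
    assert(n > 0 && q >= 0.0 && q <= 1.0);
    // Assumed convention: fractional index q * (n - 1) into the sorted data.
    const double pos = q * static_cast<double>(n - 1);
    const auto lo = static_cast<std::size_t>(std::floor(pos));
    std::advance(first, static_cast<std::ptrdiff_t>(lo));
    const T lower = *first;
    if (static_cast<double>(lo) == pos)
        return quantile_result<T>{lower, std::nullopt};  // exact hit
    ++first;
    return quantile_result<T>{lower, *first};  // between two elements
}

// Median is the quantile at 0.5.
template <class C>
auto median_of_sorted(const C& sorted)
{
    return quantile_of_sorted(std::begin(sorted), std::size(sorted), 0.5);
}
```

For a sorted std::vector<int> holding 1, 2, 3, 4, median_of_sorted returns the pair (2, 3); the caller has to test the optional, which is the usability concern raised in the notes.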

disagree with alternative fn overload

1 accumulator that brings back 4; this now has memory allocation concerns,
but it is an output iterator
or 4 separate accumulators that each bring back 1
Use case: compute several things over the same range, and scan a large
range only once
If it is a random access range, then just jump to where the data is;
want an optimization for the random access case
Unfriendly interface; do we expect a lot of data to be read only once?
A selection algorithm on unsorted data is O(N) instead of O(N log N)
Don't remove the accumulators yet, but the sweet spot may be small
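
The selection-algorithm remark can be sketched with std::nth_element, which has linear complexity on average, compared with O(N log N) for a full sort. The helper name median_unsorted is hypothetical, and averaging the two middle elements for an even count is an assumption made here; the notes leave open whether to average or return both.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of unsorted data without a full sort.
// The vector is taken by value because std::nth_element reorders it.
double median_unsorted(std::vector<double> data)
{
    assert(!data.empty());
    const auto mid = static_cast<std::ptrdiff_t>(data.size() / 2);
    // Place the element that would sit at index mid in sorted order;
    // everything before it is <= it, everything after is >= it.
    std::nth_element(data.begin(), data.begin() + mid, data.end());
    const double upper = *(data.begin() + mid);
    if (data.size() % 2 == 1)
        return upper;
    // Even count: the lower middle element is the largest of the first half.
    const double lower = *std::max_element(data.begin(), data.begin() + mid);
    return (lower + upper) / 2.0;  // assumed convention: average the two
}
```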

Naming with sorted range, not on the return; try various naming schemes:
quantiles_of_sorted
How about a tag? Usually only for constructors

The quantile fn needs a constraint: Q must be convertible from the value
type of the range
Template parameters have R first and sometimes last; can we be
consistent? So they will go back
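
One way to express the convertibility constraint, sketched in C++17 with static_assert rather than a C++20 concept. The function name and the linear interpolation between neighbours are assumptions added only to give the constraint something to act on; they are not from the paper. The range parameter R comes first in the template parameter list, as one possible consistent ordering.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical sketch: R first, Q second. Q is the quantile and result type.
template <class R, class Q>
Q interpolated_quantile_of_sorted(const R& sorted, Q q)
{
    using T = typename std::iterator_traits<
        decltype(std::begin(sorted))>::value_type;
    static_assert(std::is_floating_point_v<Q>,
                  "assumed here: Q is float, double or long double");
    static_assert(std::is_convertible_v<T, Q>,
                  "the range's value type must convert to Q");
    const std::size_t n = std::size(sorted);
    assert(n > 0 && q >= Q(0) && q <= Q(1));
    const Q pos = q * static_cast<Q>(n - 1);        // fractional index
    const auto lo = static_cast<std::size_t>(pos);  // floor, since pos >= 0
    const auto at = [&](std::size_t i) {
        return static_cast<Q>(
            *std::next(std::begin(sorted), static_cast<std::ptrdiff_t>(i)));
    };
    const Q below = at(lo);
    if (lo + 1 >= n)
        return below;  // q == 1: last element
    const Q above = at(lo + 1);
    // Assumed convention: interpolate linearly between the two neighbours.
    return below + (pos - static_cast<Q>(lo)) * (above - below);
}
```

Calling this on a std::vector<int> with q of type double compiles because int converts to double; a range of a type with no conversion to Q is rejected at compile time.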

>
> 2.2.1.2 Reinforcement Learning Larry Lewis Jorge Silva
>
> Reinforcement Learning proposal:
>
> 2.2.1.3 Differential Calculus:
>
>
> https://docs.google.com/document/d/175wIm8o4BNGti0WLq8U6uZORegKVjmnpfc-_E8PoGS0/edit?ts=5fff27cd#heading=h.9ogkehmdmtel
>
> 2.2.1.4: Stats paper
>
> Current github
>
> https://github.com/cplusplus/papers/issues/475
>
> https://github.com/cplusplus/papers/issues/979
>
> Stats review Richard Dosselman et al
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1708r4.pdf
>
> Feedback from Johan Lundberg and Oleksandr Korval
>
> https://isocpp.org/files/papers/D2376R0.pdf
>
> P1708R3: Math proposal for Machine Learning: 3rd review
>
> PXXXX: combinatorics: 1st Review
>
> > std.org/jtc1/sc22/wg21/docs/papers/2020/p1708r2
> > above is the stats paper that was reviewed in Prague
> > http://wiki.edg.com/bin/view/Wg21prague/P1708R2SG19
> >
> > Review Jolanta Polish feedback.
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2119r0.html
>
>
> 2.2.1.5: Matrix paper
>
> 2.2.3 any other proposal for reviews?
>
> 2.3 Other Papers and proposals
>
> P1416R1: SG19 - Linear Algebra for Data Science and Machine Learning
>
> https://docs.google.com/document/d/1IKUNiUhBgRURW-UkspK7fAAyIhfXuMxjk7xKikK4Yp8/edit#heading=h.tj9hitg7dbtr
>
> P1415: Machine Learning Layered list
>
> https://docs.google.com/document/d/1elNFdIXWoetbxjO1OKol_Wj8fyi4Z4hogfj5tLVSj64/edit#heading=h.tj9hitg7dbtr
>
> 2.2.2 SG14 Linear Algebra progress:
> Different layers of proposal
>
> https://docs.google.com/document/d/1poXfr7mUPovJC9ZQ5SDVM_1Nb6oYAXlK_d0ljdUAtSQ/edit
>
> 2.5 Future F2F meetings:
>
> 2.6 future C++ Standard meetings:
> https://isocpp.org/std/meetings-and-participation/upcoming-meetings
>
> None
>
> 3. Any other business
>
> New reflector
>
> http://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
> Old Reflector
> https://groups.google.com/a/isocpp.org/forum/#!newtopic/sg19
> <https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!forum/sg14>
>
> Code and proposal Staging area
>
> 4. Review
>
> 4.1 Review and approve resolutions and issues [e.g., changes to SG's
> working draft]
>
> 4.2 Review action items (5 min)
>
> 5. Closing process
>
> 5.1 Establish next agenda
>
>
> 5.2 Future meeting
>
>
>
>
> May 12, 2022 02:00 PM ET: Stats
> June 9, 2022 02:00 PM ET: Graph
> Jul 14, 2022 02:00 PM ET: Matrix, RL and DC
> Aug 11, 2022 02:00 PM ET: Stats
> Sep 13, 2022 02:00 PM ET: Graph
> Oct 12, 2022 02:00 PM ET: Matrix RL/DC
>

Received on 2022-05-12 19:04:57