ISOCPP sg19 List: Re: [isocpp-sg19] [Cxxpanel] Re: Basic Statistics P1708R9

From: Oliver Rosten <oliver.rosten_at_[hidden]>
Date: Fri, 1 Nov 2024 16:55:46 +0000

Hi Mark,

I really appreciate your input!

I am extremely busy over the next 2 weeks but anticipate returning to this,
as a priority, thereafter.

So if you're happy to postpone this discussion slightly, then we can resume
in a couple of weeks (which I'm very much looking forward to as this is
interesting).

Oliver

On Fri, 1 Nov 2024 at 16:09, Mark Hoemmen <mark.hoemmen_at_[hidden]> wrote:

> Hi Oliver! I'd be happy to give feedback on this proposal as well.
> Please feel free to send me drafts for feedback if you like. I'll
> write a little bit here.
>
> Section 5.2 says that the result is unspecified if the range has an
> Inf or NaN in it. WG21 lately has expressed a preference to make
> behavior with Infs and NaNs well-defined, for example in P3008R2
> (atomic floating-point min/max), and even to give users a choice
> between "propagate NaNs" and "treat NaNs as missing values." I agree;
> I think we need good reasons _not_ to make results well-defined.
> Section 4.3 claims that users should just filter ranges by
> `!std::isnan`. Is that what users of other statistical software
> packages expect?
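
The two NaN policies mentioned above ("propagate NaNs" versus "treat NaNs as missing values") could be made an explicit choice, roughly as sketched below. This is only an illustration: `nan_policy` and `mean_with_policy` are hypothetical names that appear in neither P1708 nor P3008, and returning NaN when no usable values remain is an assumption, not proposed wording.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical policy tag; the name does not come from P1708 or P3008.
enum class nan_policy { propagate, treat_as_missing };

// Arithmetic mean of a sequence of doubles under an explicit NaN policy.
inline double mean_with_policy(const std::vector<double>& xs, nan_policy policy)
{
    double sum = 0.0;
    std::size_t count = 0;
    for (double x : xs) {
        if (std::isnan(x)) {
            if (policy == nan_policy::propagate)
                return std::numeric_limits<double>::quiet_NaN();
            continue; // treat_as_missing: skip the NaN entirely
        }
        sum += x;
        ++count;
    }
    // Assumption: with no usable values the result is NaN, not unspecified.
    if (count == 0)
        return std::numeric_limits<double>::quiet_NaN();
    return sum / static_cast<double>(count);
}
```

With the input `{1.0, NaN, 3.0}`, the `propagate` policy yields NaN, while `treat_as_missing` yields the mean of the two remaining values.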
>
> Regarding the "first value" in a reduction, my suggestion is to
> imitate std::reduce. That is, optionally take a "first value" as an
> input parameter.
>
> ```c++
> template<class T, ranges::input_range R>
> constexpr auto mean(R&& r, T init) -> T;
> template<ranges::input_range R>
> constexpr auto mean(R&& r) -> std::ranges::range_value_t<R>;
> ```
>
> This has the following advantages.
>
> 1. Users would not need to give an explicit template parameter `T` for
> the result. They could just pass in a number like 0.0.
> 2. It would support use cases like computing statistics over part of a
> range, then continuing the computation on the "rest" of the range.
> 3. It would help distinguish overloads. The Standard Algorithms
> generally don't have overloads with the same number of arguments but
> different numbers of template parameters.
> 4. The Standard Algorithms don't generally permit explicit template
> arguments.
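
A minimal sketch of the `init`-taking overload follows. It assumes that `init` seeds the running sum and fixes the result type `T`, as it does for `std::reduce`; neither the email nor the proposal pins down that semantics. It is written without concepts so that it also builds as C++17, and the non-empty precondition is checked with `assert` purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: `init` seeds the running sum and fixes the result type T,
// mirroring std::reduce. This semantics is an assumption, not P1708 wording.
template <class R, class T>
constexpr T mean(R&& r, T init)
{
    T sum = init;
    std::size_t count = 0;
    for (auto&& x : r) {
        sum += static_cast<T>(x);
        ++count;
    }
    assert(count > 0); // precondition assumed here: the range is not empty
    return sum / static_cast<T>(count);
}
```

Calling `mean(std::vector<int>{1, 2, 3, 4}, 0.0)` deduces `T` as `double` from the argument, so no explicit template argument is needed (advantage 1 above).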
>
> Regarding accumulator objects, it's not clear how parallelization
> would work. Reducers in the proposal are stateful (because
> `operator()` returns `void`). Thus, parallelization would call for
>
> 1. a way to create a separate copy of the reducer for each thread,
> 2. a way to "initialize" the reducer's state to the identity for the
> reduction operation, and
> 3. a way to combine intermediate reduction results.
>
> It looks like the proposal isn't trying to define a "reducer concept"
> for user-defined reducers. That's fine, but it might help to think
> about how a reducer concept would be defined. I might suggest looking
> at other parallel programming models, such as Kokkos
> (https://kokkos.org/kokkos-core-wiki/API/core/builtinreducers/ReducerConcept.html),
> for inspiration.
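
The three requirements listed above can be sketched with a hand-written reducer for the mean. Everything here is hypothetical: `mean_reducer`, its `init`, `join` and `result` members, and `chunked_mean` are illustrative names, not the proposal's accumulator interface and not the Kokkos API. The "parallel" driver is a sequential stand-in that only shows where per-thread copies, identity initialization, and combination would occur.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stateful reducer for the mean; all names are illustrative.
struct mean_reducer {
    double sum = 0.0;
    std::size_t count = 0;

    // (2) Reset the state to the identity of the reduction.
    void init() { sum = 0.0; count = 0; }

    // Accumulate one value; returns void, so the reducer carries the state.
    void operator()(double x) { sum += x; ++count; }

    // (3) Fold another reducer's partial result into this one.
    void join(const mean_reducer& other) { sum += other.sum; count += other.count; }

    // Precondition assumed: count > 0.
    double result() const { return sum / static_cast<double>(count); }
};

// Sequential stand-in for a parallel runtime. Precondition: chunks > 0.
inline double chunked_mean(const std::vector<double>& xs, std::size_t chunks)
{
    std::vector<mean_reducer> partial(chunks); // (1) one reducer per "thread"
    for (auto& r : partial) r.init();
    for (std::size_t i = 0; i < xs.size(); ++i)
        partial[i % chunks](xs[i]);

    mean_reducer total;
    total.init();
    for (const auto& r : partial) total.join(r);
    return total.result();
}
```

Because floating-point addition is not associative, the joined result can differ in the last bits from a strictly sequential sum, which is one more thing a reducer specification would have to address.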
>
> Thanks for all your work!
> mfh
>
>
>
> On Tue, Oct 29, 2024 at 11:19 AM Oliver Rosten via SG19
> <sg19_at_[hidden]> wrote:
> >
> > Thanks, Michael! I appreciate your kind words.
> >
> > I was worried it would take me a few days or at worst a few weeks to get
> to the paper. So it looks like my worries were unfounded.
> >
> > I'll put something together as soon as I am able.
> >
> >
> >
> > On Tue, 29 Oct 2024 at 16:48, Michael Wong <fraggamuffin_at_[hidden]>
> wrote:
> >>
> >> No worries, this is a pretty normal process. Remember, we are always
> grateful for ways to improve the paper at any stage. Our biggest fear is
> that it goes out with insufficient voice.
> >>
> >> The reason for the paper is to maintain full transparency, so that we
> can point to it later and say, we saw the written feedback and this is how
> we addressed it, whereas if we hid it inside Sg19, then people might
> question our transparency. The written paper is also important to make sure
> we don't have a constantly moving goal post - where there are further
> requirements such that it is never satisfied. Of course more feedback is
> always welcome, just not changing the same feedback once it's been
> satisfied.
> >>
> >> But I trust you and we worked together before where you have
> demonstrated your hard work and honour, so I am quite easy on this, though
> other chairs may be much stricter and demand the paper before the
> next SG19 meeting of Nov 14. I think as long as you have a paper in the
> post Poland mailing of Dec 15, it will be fine. We can take this email as
> the feedback. In the Post Poland mailing, you can even say how we handled
> it and what else is still missing.
> >>
> >>
> >> On Tue, Oct 29, 2024 at 12:05 PM Oliver Rosten <
> oliver.rosten_at_[hidden]> wrote:
> >>>
> >>> Hi Michael,
> >>>
> >>> Thanks for your input. What's the latest I can get away with writing
> this paper?
> >>>
> >>> I know the paper will be short, but I am currently rammed.
> >>>
> >>> Oh, and to dispel any worries: the paper will be a friendly one :)
> >>>
> >>> I want to see the stats functions standardized. I just think they are
> under-spec'd as it stands.
> >>>
> >>> O.
> >>>
> >>> On Tue, 29 Oct 2024 at 15:02, Michael Wong <fraggamuffin_at_[hidden]>
> wrote:
> >>>>
> >>>> Hi, feedback is always welcome. I have included the SG19 reflector and
> the author.
> >>>>
> >>>> Procedurally, the right way to do this now, because this paper has
> already been voted out of SG19 several years ago (when Oliver was not
> there) and is now at LEWG or beyond, is to write a counter paper and offer
> it either as a friendly (as in I can go along with what is there but am
> offering these suggested changes to make it better) or hostile (as in I
> will oppose the paper and favor my direction) amendment.
> >>>>
> >>>> It looks to me that some of the objections are technical, so it
> really should be handled in the SG19 and not in LEWG (these are not naming
> or library specific issues), so I would say still write a D paper listing
> the objections and proposed alternatives for improvement or rejection, but
> because Prof Dosselmann has been super responsive, I am sure he will catch
> on to this, and we will still be able to address this in the next SG19 call
> on Nov 14. If at that time we can't resolve this and have an update ready
> for Wroclaw we will inform LEWG to hold.
> >>>>
> >>>>
> >>>> So Oliver, can you write a D paper on this objection, pretty much
> saying what you have in the email (Guy and I can help you structure the
> paper)? The reason is that we can post it as a P paper post-meeting, whether
> you are satisfied with the resolution or not, and everyone can come on Nov 14
> prepared to discuss it.
> >>>>
> >>>> P.S. I will be away in the opposite time zone on Nov 14 so I might
> get Phil or Guy to chair it.
> >>>>
> >>>> On Tue, Oct 29, 2024 at 8:24 AM Oliver Rosten <
> oliver.rosten_at_[hidden]> wrote:
> >>>>>
> >>>>> @Michael Wong what are your thoughts?
> >>>>>
> >>>>> On Tue, 29 Oct 2024 at 12:18, <guy.davidson_at_[hidden]> wrote:
> >>>>>>
> >>>>>> I think this warrants a brief paper to ensure it is discussed in
> session, but unfortunately the deadline for Wroclaw has passed. I find it
> unlikely that the paper will leave SG19 this time though, so it is still
> worthwhile. Even then, I could request that the paper be brought as a late
> consideration when reviewing the stats paper. Reflectors are becoming busy
> places, even to the most dedicated of followers.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Cheers,
> >>>>>> G
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> From: Oliver Rosten <oliver.rosten_at_[hidden]>
> >>>>>> Sent: 29 October 2024 12:13
> >>>>>> To: guy.davidson_at_[hidden]; cxxpanel_at_[hidden]
> >>>>>> Subject: Re: [Cxxpanel] Basic Statistics P1708R9
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Procedurally, what's the best thing to do here?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Raise it on a reflector? If so SG19, SG6 or LEWG?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Or something else?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, 29 Oct 2024 at 12:07, <guy.davidson_at_[hidden]> wrote:
> >>>>>>
> >>>>>> I concur.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I believe there is an SG19 session scheduled for Wroclaw, and if
> not, this will also have to go through SG6. I believe there is still time
> to catch this.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Cheers,
> >>>>>> G
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> From: Oliver Rosten <oliver.rosten_at_[hidden]>
> >>>>>> Sent: 28 October 2024 15:28
> >>>>>> To: cxxpanel_at_[hidden]
> >>>>>> Subject: [Cxxpanel] Basic Statistics P1708R9
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Hi folks,
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I have some concerns about the stats paper. We didn't have time to
> get to it today, but I'd like to share my thoughts here to see what others
> think, before deciding how to proceed.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Previously, I expressed to the group at the BSI my worries about these
> functions potentially exposing a pretty big "Unspecified Behaviour"
> surface. Many of the functions have preconditions that the range they
> consume has at least some minimum number of elements. For example, the mean
> needs a range with at least one element. If the range has no elements, an
> unspecified value is returned.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> IIRC, various options were mentioned when we talked about
> this at the BSI, which I communicated to the author: leaving as-is,
> throwing, returning std::expected... The point being that the paper needs a
> proper discussion of the design space and a justification for the choice
> made. Unfortunately, the author seems to have interpreted the subsequent
> communication as an exhortation to use std::expected and has added section
> 4.7, which I do not think properly addresses the actual issue.
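
Two of the options in that design space can be set side by side in a short sketch. The function names are hypothetical, and `std::optional` is used as a C++17 stand-in for the `std::expected` mentioned above so that the example builds with older toolchains. Nothing here reflects the proposal's actual wording.

```cpp
#include <optional>
#include <stdexcept>
#include <vector>

// Option: throw when the precondition (at least one element) is violated.
inline double mean_or_throw(const std::vector<double>& xs)
{
    if (xs.empty())
        throw std::invalid_argument("mean: empty range");
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Option: report the violation in the return type. std::optional is a
// C++17 stand-in here for the std::expected mentioned in the text.
inline std::optional<double> mean_or_nullopt(const std::vector<double>& xs)
{
    if (xs.empty())
        return std::nullopt;
    return mean_or_throw(xs);
}
```

The "leave as-is" option corresponds to neither function: the caller gets back a `double` with no indication that the input was empty.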
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Additionally, in this section the paper then makes what I think is
> a dubious comparison between ranges with insufficient elements and feeding
> NaNs into the statistical functions. Which brings me to the next point.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> The C-math functions are generally very well specified if you feed
> them NaNs/infs and so I think there needs to be some justification for why
> this isn't the case here. Furthermore, the paper is completely silent on
> whether e.g. FE_INVALID is ever raised; again, a gap which needs to be
> filled.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Finally, there is the related question of what happens during
> constant evaluation if the preconditions are violated. As far as I can
> tell, the paper is silent on this. Should compilers just return whatever
> unspecified value they like? Or actually are we expecting FE_INVALID to be
> raised meaning it's not a core constant expression? Or...
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> My feeling is that the paper leaves some important questions
> unanswered. Do people concur?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Oliver
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Cxxpanel mailing list -- cxxpanel_at_[hidden]
> >>>>>> To unsubscribe send an email to cxxpanel-leave_at_[hidden]
> >
> > --
> > SG19 mailing list
> > SG19_at_[hidden]
> > https://lists.isocpp.org/mailman/listinfo.cgi/sg19
>

Received on 2024-11-01 16:56:02