C++ Logo

sg19

Advanced search

Re: [SG19] P1708R4 free functions feedback

From: Oleksandr Koval <oleksandr.koval.dev_at_[hidden]>
Date: Mon, 3 May 2021 20:51:59 +0300
Thank you, joined.

On Mon, May 3, 2021 at 6:59 PM Michael Wong <michael_at_[hidden]> wrote:

> Hi Oleksandr, thank you for the implementation feedback. You can also join
> the SG19 discussion directly as it is open to all:
>
> You can join SG19 here:
>
> https://lists.isocpp.org/mailman/listinfo.cgi/sg19
>
>
>
>
>
> The next call will be on May 13. Once you join the mailing list, you will
> receive the invite and we should have a chance to discuss it directly.
>
>
>
> Cheers
>
>
>
> --
>
> Mit freundlichen Grüßen/ With best regards/ 祝一切顺利 / Yoroshiku onegai-shimasu / Cordialement / Gracias / Большое Спасибо
>
> Distinguished Engineer
>
> Industry Expert
>
> Director, VP, ISOCPP.org <http://isocpp.org/>
>
> ISO C++ Directions Group Chair
>
> Chair of Khronos SYCL C++ Heterogeneous Programming Language
>
> Chair of WG21 C++ SG19: Machine Learning
>
> Chair of WG21 C++ SG14: Low-latency/Games Dev/Financial/Trading/Embedded/Simulation
>
> Editor SG5 Transactional Memory TS
>
> Editor SG1 Concurrency TS
>
> Chair of Standards Council Canada SC22 Programming Languages
>
> Chair of Standards Council Canada TC22/SC32 Electrical and electronic
> components (SOTIF) Chair of UL4600 Object Tracking
> http://wongmichael.com/about
>
>
>
> Michael Wong
>
> Codeplay Software Ltd
>
> Level C Argyle House, 3 Lady Lawson Street,
>
> Edinburgh, EH3 9DR
>
> Tel: 0131 466 0503
>
> Fax: 0131 557 6600
>
> Website: http://www.codeplay.com
>
> Twitter: https://twitter.com/codeplaysoft
>
>
>
> This email and any attachments may contain confidential and /or privileged
> information and is for use by the addressee only. If you are not the
> intended recipient, please notify Codeplay Software Ltd immediately and
> delete the message from your computer. You may not copy or forward it, or
> use or disclose its contents to any other person. Any views or other
> information in this message which do not relate to our business are not
> authorized by Codeplay software Ltd, nor does this message form part of any
> contract unless so stated. As internet communications are capable of data
> corruption Codeplay Software Ltd does not accept any responsibility for any
> changes made to this message after it was sent. Please note that Codeplay
> Software Ltd does not accept any liability or responsibility for viruses
> and it is your responsibility to scan any attachments. Company registered
> in England and Wales, number: 04567874 Registered office: 81 Linkfield
> Street, Redhill RH1 6BY
>
>
>
> *From: *dosselmr_at_[hidden]
> *Sent: *Monday, May 3, 2021 11:45 AM
> *To: *Oleksandr Koval <oleksandr.koval.dev_at_[hidden]>
> *Cc: *chiu_at_[hidden]; Larry.Lewis_at_[hidden]; Jens.Maurer_at_[hidden];
> eniebler_at_[hidden]; phil.ratzloff_at_[hidden]; vreverdy_at_[hidden]; Michael
> Wong <michael_at_[hidden]>; sg19_at_[hidden]
> *Subject: *Re: P1708R4 free functions feedback
>
>
>
> Morning.
> Thanks for your very detailed feedback. I will have a better look at
> your comments when I have an available moment.
> Thanks,
> Richard
> Quoting Oleksandr Koval <oleksandr.koval.dev_at_[hidden]>:
>
> > Hi, I needed similar stats functions in my project so I tried to
> implement
> > proposed free functions from P1708R4. I faced several
> > problems/typos/questions, maybe my experience can be useful for the
> future
> > development of this proposal.
> > General things
> >
> > 1.
> >
> > Since C++20 constraint for ExecutionPolicy algorithms has been
> updated to
> > std::is_execution_policy_v<std::remove_cvref_t<ExecutionPolicy>>
> instead
> > of std::is_execution_policy_v<std::decay_t<ExecutionPolicy>>.
> > 2.
> >
> > Shouldn't stats_error have const char*/const
> > std::string&/string_view constructor
> > to report what?
> > 3.
> >
> > I believe that various(if any) throwing checks are redundant. Range
> size
> > checks are cheap only for random access iterators. No existing
> algorithm
> > throws when range(s) has bad size. The common case here is when
> values and
> > weights ranges have different sizes, I think we can just leave it UB.
> > Another one is when range has 0 or < N elements. In case of mean() we
> > can just divide 0/0 and get inf for empty range. No exceptions are
> > needed. Range value checks are expensive, since we're dealing
> > with floating
> > point types there's no problem with meaningless operations like
> > square root
> > of negative value.
> > 4.
> >
> > Use std::conditional_t in function signatures instead of
> > conditional<>::type.
> >
> >
> > Iterator kind requirements
> >
> > 1.
> >
> > All existing algorithms with ExecutionPolicy support require forward
> > iterator, not input iterator.
> > 2.
> >
> > Review iterator requirements. For some functions it's very hard if
> > possible at all to support input iterator. For example, unweighted
> > sorted_quantile(r,
> > q, proj, interpolation). All it needs is the size of the range to
> select
> > proper element(s). For input iterator it needs to iterate over all
> > elements just to determine range size, but it's not possible to select
> > proper element(s) afterward because iterators are not valid anymore.
> The
> > only option I can see here is to copy the whole range into vector,
> such
> > an algorithm has O(n) space complexity.
> >
> > Mean
> >
> > 1. I think it would be nice to provide some overflow guarantees to
> > encourage implementations to avoid it. For example, use the recurrence
> > relation formula for mean(): mean(n) = mean(n-1) + (data[n] -
> > mean(n-1))/(n+1). Maybe it's better to provide specific formulas for
> > each function instead of general ones in the Part 3 of this proposal.
> > Another option is to specify under what conditions overflow is
> allowed.
> >
> > Quantile
> >
> > 1.
> >
> > Make interpolation_t a scoped enum and remove trailing _t from
> > enumerator names.
> > 2.
> >
> > Typo, it still returns tuple instead of Result although interpolation
> is
> > used.
> > 3.
> >
> > Probably a typo, since interpolation is used, constraint for elements
> > should be updated. Here's the fixed version of (1):
> >
> > template<typename R, typename P = std::identity, typename Q,
> > // result_t is just an alias for `conditional_t<...>`
> > typename Result = result_t<R, P>>// `stats_frange` is like
> > `stats_range` but requires forward_iteratorrequires stats_frange<R, P>
> > && std::is_floating_point_v<Q>constexpr auto sorted_quantile(R&& r, Q
> > q, P proj = {},
> > interpolation_t interpolation = interpolation_t::linear) -> Result;
> >
> >
> > 1.
> >
> > Consider returning iterators instead of potential interpolation of
> pure
> > values. Interpolation with existing modes(higher, lower, etc.) can be
> > provided as a separate function. For example, std::max_element() is
> also
> > supposed to be a numeric function, but it returns an iterator.
> > One approach
> > for this is to return std::pair<std::pair<range_iter, q_value>,
> > std::pair<range_iter, q_value>> for unweighted versions and
> > std::pair<std::pair<range_iter,
> > weight_sum>, std::pair<range_iter, weight_sum>> for weighted ones.
> p.first
> > == end means *no answer*, p.first == p.second means *single*
> > answer, p.first
> > != p.second means *interpolation* is needed.
> > 2.
> >
> > Related to issue 6. How to write ExecutionPolicy version for an
> > unweighted sorted quantile/median algorithm? All it does is finds
> range
> > size and then reads one or two values, there are no parallel versions
> for
> > std::ranges::distance() and std::ranges::advance().
> > 3.
> >
> > Maybe it's obvious but specify that weighted quantiles can be >= 1.
> > Because in section 3.2 there is 0 <= q <= 1 requirement.
> > 4.
> >
> > Clarify unsorted_ versions. One option to implement them is in terms
> of
> > nth_element() which mutates the original sequence. Another option is
> to
> > use additional vector of indices and sort them, but it requires O(n)
> > space.
> > 5.
> >
> > What's the purpose of typename T = double in unweighted versions? It
> can
> > be just typename T.
> > 6.
> >
> > Why is there a version sorted_quantile(r, q, interpolation) without
> > projection? I think that wherever we have any range, we should have
> > projection for it as well, and versions without projections should be
> > completely removed. The same for sorted_quantiles() et al., there
> should
> > be projections for values, quantiles and weights ranges.
> > 7.
> >
> > How about unsorted_quantiles()? When the number of quantiles is small,
> > it can be more efficient than sorting and using sorted_ versions.
> > 8.
> >
> > Clarify weighted sorted_ algorithms. In particular, the case when
> > weights are zeros. For example, weighted median for weights
> > {1,0,1,0} should
> > be interpolation of 1st and 3rd elements. Clarify what should be done
> in
> > the case when weight of the first element is already more than target
> > quantile. I suppose that in such a case there's nothing to
> interpolate and
> > the first element should be returned. Think about the complexity of
> the
> > unsorted version of this approach. It might require up to N-1
> additional
> > calls to nth_element() in case when a lot of elements have zero
> weight.
> >
> > I haven't implemented the following functions, hence, not so much I can
> say
> > for them, only minor/interface issues.
> > Mode
> >
> > 1.
> >
> > Returning through the output iterator seems extremely inconvenient.
> What
> > if the client needs only one mode or at most 3 modes? Maybe it will be
> > better to provide 3 kinds of functions: return through output
> iterator,
> > return by value, return through output range. In the last case,
> function
> > can put more modes while range is not full.
> > 2.
> >
> > Why does it require equality_comparable<iter_value<R>> if we need to
> > compare projected<R,P>? Seems that the idea was to store iter_value in
> > the hash map but it can be very inefficient. Why does it require
> > std::output_iterator<O,
> > std::iter_value_t<R>? Why does it output non-projected values? I think
> > that it should either operate/output on projected values or on
> iterators
> > (not pointed values!) to non-projected values. The latter will
> require at
> > least forward iterator. The same applies to weighted version, why
> require
> > std::equality_comparable<std::iter_value_t<Weights>>?
> > 3.
> >
> > Why does (2, 4) require exactly std::unordered_map. Is it possible to
> > allow a custom hash map and specify what operations it needs to
> support?
> >
> > Skewness
> >
> > 1. Make data_t a scoped enumeration. Choose a better name for it,
> remove
> > trailing _t from enumerator names.
> >
> > Kurtosis
> >
> > 1. Make excess param a scoped enumeration instead of bool.
> >
> > Variance
> >
> > 1. Is var() really better than variance()?
> >
> >
> > --
> > Regards,
> > Oleksandr Koval.
> >
>
>
>
> -----------------------------------------------------
> This e-mail was sent via the Secure Web Server at the
> University of Regina, Department of Computer Science
> https://www.cs.uregina.ca/WebMail/
>
>
>
>


-- 
Regards,
Oleksandr Koval.

Received on 2021-05-03 12:52:14