Date: Mon, 13 Mar 2023 20:23:32 +0000
Hi all,
As requested at the meeting last Thursday, I'm sending round the feedback I
have on the Simple Statistics & More Simple Statistics papers [P1708,
P2681].
FWIW, I personally think there is real value in standardizing this.
However, I am very nervous about the design decision to make the size of
various ranges a precondition for certain functions.
I think this is brought into sharp focus by the recent discussions of
safety. Bearing in mind that there are functions requiring a range of
length at least 1, 2, or 3, what is the diligent, safety-conscious programmer to do?
I would argue that, at the very least, programmers should protect
against violation of the aforementioned preconditions. But if the minimal
approach is taken here, it isn't future-proof: perhaps someone's code today
only involves the mean and so only needs protecting against empty ranges.
But perhaps tomorrow this code also involves the covariance, and the day
after that, skewness. With the minimal approach, the precondition checking
would need to be updated each time, which feels brittle.
I think a better approach here is for the various statistical functions to
return a std::expected. This seems like an ideal opportunity to use this
new library feature and would make code safe by default for very little
additional overhead.
My own experience of implementing a (very) modest version of aspects of
this proposal in the past was that empty ranges did find their way into the
stats functions and so protection was necessary. I decided to return an
optional (but would now go for an expected); my feeling was/is that dealing
with a range of insufficient size can reasonably be done locally, by the
caller, in this particular scenario.
At the very least, I think the various options (as-is / std::expected /
throwing) should be discussed in the papers, since the design space here is
not trivial!
All the best,
Oliver
Received on 2023-03-13 20:23:46