std-proposals

Re: Distributed random number ordering

From: Moritz Klammler <moritz_at_[hidden]>
Date: Wed, 12 May 2021 22:49:18 +0200
On 5/12/21 10:29 PM, Lénárd Szolnoki via Std-Proposals wrote:
> Hi,
>
> On the flip side, mandating the algorithms also closes off further optimization opportunities for implementations. Also, I wouldn't say that there are obvious algorithms to standardize, even for discrete distributions.
>
> What comes to mind is adding an optional template parameter to distributions that selects a specific algorithm, with an implementation-defined default.

This seems like a sensible option to me. (Assuming that a specific
algorithm can be reliably described in the standard to begin with.)

Not only would it avoid breaking changes (if the default continues to be
the existing implementation-defined behavior), it would also give
implementers the freedom to /not/ optimize and to make the standardized
algorithm their default as well, thus reducing their maintenance costs.

Going the opposite direction, implementations could also add their own
implementation-defined choices on top of "default" and "standard".
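
To make the idea concrete, here is a minimal sketch of what such an
algorithm parameter might look like. Everything in namespace sketch is
hypothetical: the tag names, the class name uniform_int and the choice
of rejection sampling are assumptions made for illustration, not
anything proposed in this thread or present in the standard library.
The fixed variant is also restricted to std::mt19937 to keep it short.

```cpp
#include <cstdint>
#include <random>

namespace sketch {

// Hypothetical algorithm tags; nothing like them exists in the standard library.
struct implementation_defined {};  // whatever the library already does
struct fixed_rejection {};         // an algorithm spelled out step by step

// The algorithm parameter defaults to the existing behaviour, so code that
// never names it keeps compiling and keeps its current results.
template <class Algorithm = implementation_defined>
class uniform_int;

// Default: forward to the library's own distribution.
template <>
class uniform_int<implementation_defined> {
    std::uniform_int_distribution<int> impl_;
public:
    uniform_int(int a, int b) : impl_(a, b) {}
    template <class URBG>
    int operator()(URBG& g) { return impl_(g); }
};

// Fixed: plain rejection sampling on the 32-bit output of std::mt19937.
// Requires a <= b. Every step is explicit, so the result depends only on
// the engine state, not on the library vendor.
template <>
class uniform_int<fixed_rejection> {
    std::int64_t a_;
    std::uint64_t range_;
public:
    uniform_int(int a, int b)
        : a_(a),
          range_(static_cast<std::uint64_t>(static_cast<std::int64_t>(b) - a) + 1) {}

    int operator()(std::mt19937& g) {
        const std::uint64_t span = std::uint64_t{1} << 32;  // number of engine outputs
        const std::uint64_t limit = span - span % range_;   // largest multiple of range_
        std::uint64_t x;
        do { x = g(); } while (x >= limit);                 // reject the biased tail
        return static_cast<int>(a_ + static_cast<std::int64_t>(x % range_));
    }
};

}  // namespace sketch
```

A user who needs portable results would write
sketch::uniform_int<sketch::fixed_rejection> d(1, 6); everybody else
keeps writing sketch::uniform_int<> and gets the library's own behaviour.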

> Cheers,
> Lénárd
>
> ------------------------------------------------------------------------
> *From:* Moritz Klammler via Std-Proposals <std-proposals_at_[hidden]>
> *Sent:* May 12, 2021 8:25:53 PM GMT+01:00
> *To:* std-proposals_at_[hidden]
> *Cc:* Moritz Klammler <moritz_at_[hidden]>
> *Subject:* Re: [std-proposals] Distributed random number ordering
>
> The fact that the random /engines/ do have their algorithms specified
> but the random /distributions/ don't has caused me discomfort in the
> past as well. As you say, it makes writing code that wants to reproduce
> the same results on every platform painfully difficult. It also makes
> for unit tests that are either fragile or much more complicated than
> they would have to be, could the algorithm be relied upon. So as far as
> I am concerned, I would be very happy if the algorithm were defined.
>
> Mandating reliable results for the discrete distributions should be
> doable; the real-valued ones would be much more challenging, I suppose,
> even if the same underlying floating-point implementation could be
> assumed. So I'm not sure whether that is realistic.
>
> Anyway, I don't think that the complexity of adding a parameter would be
> warranted, though. That would mean that all standard library
> implementations would have to implement all variants. I'm not aware of
> any actual use cases where having this flexibility would be beneficial
> for a user. And if there is a real choice to be made for some current or
> future distributions, different types like, say, a hypothetical
> fast_triangular_distribution and correct_triangular_distribution could
> always be used instead.
>
> Finally, I'm worried that defining the algorithms now (even and
> especially for those distributions where doing so would be
> straightforward) would upset many users who have come to rely
> upon the implementation-defined behavior of their current standard
> library...
>
>
> On 5/10/21 1:13 PM, RICHINGS James via Std-Proposals wrote:
>
> Dear Std-Proposals,
>
> When developing programs for scientific applications, the numerical reproducibility of code is of paramount importance if reliable results are to be obtained.
>
> One key source of error is the order in which random numbers drawn from a distribution are generated.
>
> By way of example, the standard currently does not require that a given implementation reproduce the same sequence of normally distributed random numbers, as it leaves the implementer free to choose the algorithm used to implement the normal distribution. However, this leaves the order in which normally distributed random numbers are generated out of the user's control. This is particularly troubling because gcc and llvm both currently use the polar method to implement the normal distribution but have settled on different orders in which to output the generated numbers. The result is that the even and odd values in a list of normally distributed numbers are swapped (1,3,9,4 -> 3,1,4,9). This is not a bug in either implementation, as both sequences are valid random numbers, but the ordering cannot be controlled through the current interface.
>
> This is an issue because it makes it difficult to verify our code against multiple implementations of the standard, which we find important when running across multiple machines with different HPC architectures.
>
> Ideally, the standard should specify that the order in which random numbers are returned by a distribution is controllable via a parameter, so that the sequence does not depend on the implementation, and that multiple algorithms (if implemented) are selectable by the user, making it possible to always fix the order to a desired convention. This would allow the existing default behaviour to persist while adding further control.
>
> Any thoughts on this issue are welcome.
>
> Regards,
>
> James Richings
>
> Research Software Engineer
> James Clerk Maxwell Building
> University of Edinburgh
> Edinburgh
> EH9 3FD
>
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
>
>
>
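
The even/odd permutation described in the original post can be
reproduced with a small sketch of the polar method. Each round of the
method yields two normal deviates, and the only difference between the
two variants below is which of the pair is handed out first. This is an
illustration under stated assumptions: pair_order, polar_pair and
normal_sequence are made-up names, the engine is fixed to std::mt19937,
and the code does not claim to match what gcc's or llvm's standard
library actually does internally.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical switch for the order in which the two deviates are returned.
enum class pair_order { first_then_second, second_then_first };

// One round of the Marsaglia polar method: returns two independent
// standard normal deviates computed from uniform points in the unit disc.
inline std::pair<double, double> polar_pair(std::mt19937& g) {
    double u, v, s;
    do {
        u = 2.0 * (g() / 4294967296.0) - 1.0;  // uniform in [-1, 1)
        v = 2.0 * (g() / 4294967296.0) - 1.0;
        s = u * u + v * v;
    } while (s >= 1.0 || s == 0.0);            // keep points inside the unit disc
    const double m = std::sqrt(-2.0 * std::log(s) / s);
    return {u * m, v * m};
}

// Generates 2 * n_pairs deviates from a freshly seeded engine, emitting
// each pair in the requested order.
inline std::vector<double> normal_sequence(std::uint32_t seed,
                                           std::size_t n_pairs,
                                           pair_order order) {
    std::mt19937 g(seed);
    std::vector<double> out;
    out.reserve(2 * n_pairs);
    for (std::size_t i = 0; i < n_pairs; ++i) {
        std::pair<double, double> p = polar_pair(g);
        if (order == pair_order::second_then_first) {
            std::swap(p.first, p.second);
        }
        out.push_back(p.first);
        out.push_back(p.second);
    }
    return out;
}
```

With the same seed, the two orders give sequences that contain the same
values with each adjacent pair swapped, which is the (1,3,9,4 ->
3,1,4,9) pattern from the original post. A parameter like pair_order is
one way the ordering could be put under the user's control.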


Received on 2021-05-12 15:49:26