std-proposals

Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Fri, 5 Dec 2025 13:51:39 +0000
On Fri, 5 Dec 2025 at 13:02, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:

> Hi,
>
> The main motivation here is to be able to remove the issue LWG 2524
> without creating additional issues.
> In particular avoiding the issue mentioned in p0952r2:
> "code that depends on a specific sequence of results from repeated
> invocations, or on a particular number of calls to the URBG argument,
> will be broken."
>
> Why? I work daily with Monte Carlo simulations, and if I want to
> parallelise simulations, I have to be able to discard RNGs properly. A
> lot of work has been dedicated to making those discards fast, but with
> this proposal I would now have to actually produce the numbers
> produced by the canonical_distribution and check them one by one to
> make sure there has not been any additional discard.
>

It should be noted that it is not specified how or when any of the random
distributions in <random> use std::generate_canonical. Nothing says that
std::exponential_distribution can't call std::generate_canonical twice, or
any number of times. Ever since C++11, the random number distribution
requirements ([rand.req.dist]) have said that d(g) performs an "amortized
constant number of invocations of g". So the count doesn't have to be constant:
it can be k for 99.9% of invocations, and then k+1 for 0.1% of invocations.

So if your goal is to ensure that calling d(g) N times will invoke g()
exactly kN times, that is already not guaranteed, and never was guaranteed.
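To make that concrete, here is a minimal sketch of how you could measure the number of g() invocations yourself. The names counting_urbg and count_invocations are hypothetical helpers, not part of <random>, and whatever count you observe is specific to one implementation and one engine; nothing in the standard promises it will be exactly kN.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper (not in <random>): wraps a URBG by reference and
// counts how many times operator() is called on it.
template <class Urbg>
struct counting_urbg {
    using result_type = typename Urbg::result_type;

    Urbg& inner;
    std::uint64_t calls = 0;

    static constexpr result_type min() { return Urbg::min(); }
    static constexpr result_type max() { return Urbg::max(); }

    result_type operator()() {
        ++calls;
        return inner();
    }
};

// Hypothetical helper: draws n values from d and returns the total number
// of invocations of the underlying engine that those draws caused.
template <class Dist, class Urbg>
std::uint64_t count_invocations(Dist d, Urbg& engine, std::uint64_t n) {
    assert(n > 0);
    counting_urbg<Urbg> g{engine};
    for (std::uint64_t i = 0; i < n; ++i) {
        (void)d(g);  // the value is irrelevant here, only the call count
    }
    return g.calls;
}
```

Calling count_invocations(std::exponential_distribution<double>(1.0), eng, N) with a std::mt19937 eng and comparing the result against k*N for various N and seeds shows what a given standard library does today. It says nothing about what another implementation, or a later release of the same one, is permitted to do.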

If you're using std::generate_canonical yourself directly, not via one of
the distributions in <random>, then you also need to explain that in your
proposal. If you're using it directly then you can indeed observe a change
in the contract of std::generate_canonical. But if you're only using it
indirectly via distributions, the contracts of those distributions never
guaranteed an exact number of invocations of the URBG.
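For the direct-use case, a caller who wants to stay inside [0, 1) regardless of which library they build against can guard the result themselves. The sketch below is in the spirit of the libstdc++ workaround mentioned further down in the quoted text (return the largest value below 1 instead of 1). The name canonical_below_one is hypothetical, and this is not a copy of any library's actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <random>

// Hypothetical wrapper (not a standard function): calls
// std::generate_canonical directly and, if the result rounds up to 1,
// substitutes the largest representable value below 1.
template <class Real, class Urbg>
Real canonical_below_one(Urbg& g) {
    Real r = std::generate_canonical<Real, std::numeric_limits<Real>::digits>(g);
    if (r >= Real(1)) {
        r = std::nextafter(Real(1), Real(0));
    }
    assert(r >= Real(0) && r < Real(1));
    return r;
}
```

Note that this guard does not change how many times g() is invoked. It only changes the value returned in the rare case that would otherwise have produced 1.0, so it slightly biases the result towards the top of the range.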



>
> What I suggest is up for discussion.
> - Maybe this centered_canonical_distribution is to be used exclusively
>   inside other distributions like std::exponential_distribution,
>   to avoid LWG 2524 while not breaking the specific
>   sequence.
> - Maybe we want to expose this "centered_canonical_distribution" as a
>   generalisation of canonical_distribution (canonical_distribution would
>   be a specialisation of centered_canonical_distribution, setting the
>   limit parameter to 1).
> - The reason why I introduced this limit parameter today is to show how
>   we can use this approach to prioritise either backward compatibility
>   or a more elegant symmetric distribution. That's up for discussion, and
>   I felt that giving that additional flexibility would make this a more
>   realistic proposal.
> - The symmetry around 0 does have some nice properties, but I think
>   that is of second order compared to fixing LWG 2524 and avoiding
>   the issues created by p0952r2.
> - Bear in mind that the relatively recent trend of using quantisation in
>   TPUs makes this more relevant: using floats with less precision means
>   that hitting a value of 1 using canonical_distribution becomes more
>   and more likely. Being able to benefit from subnormal numbers on both
>   sides of 0 will definitely make this issue go away, and would allow
>   32-, 64- or even 128-bit unsigned integer generators to be used to
>   generate float16 values safely, for example.
>
> I am new to this process, so if there is any meeting or forum to
> discuss this (besides these emails) I would be more than happy to be
> a part of it.
>
> Lucas
>
> On Fri, 5 Dec 2025 at 12:32, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >
> >
> >
> > On Fri, 5 Dec 2025 at 12:17, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >>
> >> On Fri, 5 Dec 2025 at 11:03, Juan Lucas Rey wrote:
> >>>
> >>> Hello,
> >>>
> >>> Here is an example code that shows:
> >>>
> >>> -the original issue reproduced as shown in
> >>> https://cplusplus.github.io/LWG/issue2524
> >>
> >>
> >> For anybody else trying it, the original problem can be reproduced with
> >> libc++ but not libstdc++, because libstdc++ has a workaround in
> >> std::generate_canonical which returns the largest value of RealType which
> >> is less than RealType(1).
> >>
> >>>
> >>>
> >>> - std2::exponential_distribution, using internally
> >>> std2::generate_canonical_centered and NOT showing the issue
> >>> -other values are the same
> >>>
> >>> I have added a double template parameter "limit" to
> >>> "std2::generate_canonical_centered". To allow maximum backward
> >>> compatibility, that value should be close to 1.
> >>> Setting that value to 0.5 is maybe more elegant, but less backward
> >>> compatible.
> >>
> >>
> >> So it's just a hardcoded "use a different calculation if the result
> >> would be greater than LIMIT" branch inside generate_canonical_centred. OK.
> >> But isn't that alternative path taken for a tiny fraction of the values
> >> from the URNG? Something like 1e-8 of the full range? So it's not
> >> symmetrical around zero? Instead it seems to return values in the range
> >> [-1e-08, 1.0).
> >>
> >> I thought the proposal was for a symmetric distribution around zero? So
> >> I think I'm failing to understand the proposal.
> >
> >
> >
> > Also, is that 'limit' parameter actually part of the proposal? Because
> > it wasn't shown in the original PDF you sent, but it seems to be necessary
> > to offer the backwards compatibility which is mentioned in the PDF and in
> > your original email.
> >
> >
> >>
> >>
> >> I'm not sure how very occasionally returning values below zero, i.e.
> >> below the expected range of [0,1), is better than very occasionally
> >> returning 1.0, i.e. above the expected range.
> >>
> >> So again, I think the idea of an alternative function that is
> >> symmetrical around zero is interesting. But I don't understand the
> >> backwards compatibility argument at all. It seems that your solution
> >> returns the same values as the old spec for std::generate_canonical in most
> >> cases, but just has a different form of failure in the problematic cases.
> >> Sure, we can adjust std::exponential_distribution to cope with the "bad"
> >> results, but std::generate_canonical_centred with a limit of 0.999999 is
> >> not centred, and fails to meet the original contract for
> >> std::generate_canonical.
> >>
> >> Unless I'm misunderstanding something, I think this proposal would make
> >> more sense if you dropped the backwards compatibility claims, and just
> >> demonstrated why a symmetric-around-zero distribution has valuable
> >> properties. But you should also show how it would be used by the rest of
> >> the library. You've shown how std::exponential_distribution can be adapted
> >> to work with (-0.5,0.5] but what about std::uniform_real_distribution and
> >> std::bernoulli_distribution, etc. I count 26 uses of generate_canonical in
> >> libstdc++'s <random>, should they all be changed to use
> >> generate_canonical_centred? Wouldn't that mean an extra branch and extra
> >> logic in every one of them, to handle the negative half of the range? Or
> >> maybe there's no benefit to some distributions and they should continue to
> >> use std::generate_canonical? I think the proposal needs to explain the
> >> extent of the changes that would result from introducing this new function.
> >
> >
> >
> > In other words, if the proposal is to allow
> > std::generate_canonical_centred to be customized by callers (but maybe
> > default to 0.5 so it's actually centred around zero?), how would that work
> > for all the existing callers inside the std::lib? If the idea is to allow
> > users to select an algorithm that's backwards compatible with the old spec
> > (so that re-running simulations produces the same results as with the old
> > spec), how would users select that when most uses of
> > std::generate_canonical are hidden inside std::xxx_distribution and are not
> > exposed to users directly?
> >
> > tl;dr who are the expected users of this new function, and how exactly
> > would it benefit them?
> >
> >
> >
> >>
> >>
> >> Or maybe I'm just misunderstanding something ... in which case that
> >> suggests something in the proposal needs to be clarified.
> >>
> >>
>

Received on 2025-12-05 13:51:59