std-proposals

Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Fri, 5 Dec 2025 13:37:37 +0000

On Fri, 5 Dec 2025 at 13:02, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:

> Hi,
>
> The main motivation here is to be able to remove the issue LWG 2524
> without creating additional issues.
> In particular avoiding the issue mentioned in p0952r2:
> "code that depends on a specific sequence of results from repeated
> invocations, or on a particular number of calls to the URBG argument,
> will be broken."
>

OK, but it's too late for that. LWG 2524 has been resolved by accepting
P0952 into C++26. If you want to undo that, the time for that was about 3
months ago when the C++26 draft was sent out for national body ballot
(which your employer took part in).



>
> why? I work daily with Monte Carlo simulations and if I want to
> parallelise simulations, I have to be able to discard RNGs properly. A
>

What do you mean by "discard RNGs"? Deterministically discard values
produced from RNGs?


> lot of work has been dedicated to making those discards fast, but with
> this proposal I would now have to actually produce the numbers
> produced by the canonical_distribution and check them one by one to
> make sure there has not been any additional discard.
>

This motivation should be in the proposal.

But why do you need to avoid additional discards? So that using one random
engine across multiple threads is deterministic, with each thread invoking
the random engine exactly N times? Or something else?
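
A minimal sketch of the pattern this question seems to be about, assuming each worker is meant to own a fixed-size slice of one logical stream. The names engine_for_worker and slices_line_up are hypothetical, not taken from the proposal. The partitioning is only valid if every sample consumes a known number of engine invocations, which is what a retry inside generate_canonical would break.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: position an engine at the start of worker `index`'s
// slice, assuming each worker consumes exactly `draws_per_worker` invocations.
std::mt19937_64 engine_for_worker(std::uint64_t seed, unsigned index,
                                  unsigned long long draws_per_worker)
{
    std::mt19937_64 eng(seed);
    eng.discard(index * draws_per_worker);  // skip the slices of earlier workers
    return eng;
}

// Checks that worker 1's slice matches what a single serial engine produces
// at the same positions in the stream.
bool slices_line_up(std::uint64_t seed, unsigned long long draws_per_worker)
{
    std::mt19937_64 serial(seed);
    std::vector<std::uint64_t> all;
    for (unsigned long long i = 0; i < 2 * draws_per_worker; ++i)
        all.push_back(serial());

    std::mt19937_64 w1 = engine_for_worker(seed, 1, draws_per_worker);
    for (unsigned long long i = 0; i < draws_per_worker; ++i)
        if (w1() != all[draws_per_worker + i])
            return false;
    return true;
}
```

If a single sample occasionally takes an extra draw, the fixed draws_per_worker offset no longer matches the serial stream from that point on.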



>
> What I suggest is up to discussion.
>

The proposal should say that. It's currently unclear.



> -maybe this centered_canonical_distribution is to be used exclusively
> inside other distributions like std::exponential_distribution,
> to avoid LWG 2524 while not breaking the specific
> sequence
>

That could be achieved the same way libstdc++ does it: when
generate_canonical would return 1.0, just return
std::nextafter(RealType(1), RealType(0)) instead. We don't need a
completely different function that doesn't return values in the range [0,1)
and which needs every caller to be adjusted for the new output range.
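
A rough sketch of that kind of clamp, together with one way a result of exactly 1.0 can arise from rounding. Both naive_canonical and clamp_below_one are hypothetical names used for illustration; this is not libstdc++'s actual code, and naive_canonical is only a simplified stand-in for scaling a single 32-bit draw into a float.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in: scale one 32-bit draw by 2^-32 to get a float.
// float has a 24-bit significand, so the largest raw values round up to
// 2^32 on conversion and the result is exactly 1.0f.
float naive_canonical(std::uint32_t raw)
{
    return std::ldexp(static_cast<float>(raw), -32);
}

// Clamp: if the value reached 1.0, return the largest RealType below 1.0,
// so the result stays in [0, 1) and no extra engine draws are needed.
template<class RealType>
RealType clamp_below_one(RealType x)
{
    if (x >= RealType(1))
        return std::nextafter(RealType(1), RealType(0));
    return x;
}

// Inverse-CDF exponential sample with rate 1, for u in [0, 1).
// With u == 1 this evaluates -log(0), which is +infinity.
float exponential_from(float u)
{
    return -std::log(1.0f - u);
}
```

With the clamp applied, the callers keep their existing [0,1) logic and the number of engine invocations per sample is unchanged.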



> -maybe we want to expose this "centered_canonical_distribution" as a
> generalisation of canonical_distribution (canonical_distribution would
> be a specialisation of centered_canonical_distribution setting the
> limit parameter to 1)
> -the reason why I introduced this limit parameter today is to show how
> we can use this approach to either prioritise backward compatibility
> or a more elegant symmetric distribution. That's up for discussion and
> I felt like giving that additional flexibility would make this a more
> realistic proposal.
>

Then the proposal should say that. You showed a function template
declaration in the proposal and said that's what you're proposing. If
you're actually proposing "maybe this, or maybe something similar with
other properties" then you need to explain that in the proposal.



> -the symmetry around 0 does have some nice properties, but I think
> that is of second order compared to fixing LWG 2524 and avoiding
> issues created by p0952r2
>

Then the proposal needs to be clear that "the issues created by P0952R2"
are what you're trying to solve.


> - bear in mind that the relatively recent trend of using quantisation in
> TPUs makes this more relevant: using floats with less precision means
> that hitting a value of 1 using canonical_distribution becomes more
> and more likely. Being able to benefit from subnormal numbers on both
> sides of 0 will definitely make this issue go away, and would allow
> using 32 or 64 or even 128 bit unsigned int generators to later generate
> float16, for example, safely.
>
> I am new to this process so if there is any meeting or forum to
> discuss this (besides these emails) I would be more than happy to be
> a part of it.
>

The discussions all happened in 2023 and could have been re-opened any time
up to about 3 months ago :-(


>
> Lucas
>
> On Fri, 5 Dec 2025 at 12:32, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >
> >
> >
> > On Fri, 5 Dec 2025 at 12:17, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >>
> >> On Fri, 5 Dec 2025 at 11:03, Juan Lucas Rey wrote:
> >>>
> >>> Hello,
> >>>
> >>> Here is an example code that shows:
> >>>
> >>> -the original issue reproduced as shown in
> >>> https://cplusplus.github.io/LWG/issue2524
> >>
> >>
> >> For anybody else trying it, the original problem can be reproduced with
> >> libc++ but not libstdc++, because libstdc++ has a workaround in
> >> std::generate_canonical which returns the largest value of RealType which
> >> is less than RealType(1).
> >>
> >>>
> >>>
> >>> - std2::exponential_distribution, using internally
> >>> std2::generate_canonical_centered and NOT showing the issue
> >>> -other values are the same
> >>>
> >>> I have added a double template parameter "limit" to
> >>> "std2::generate_canonical_centered". To allow maximum backwards
> >>> compatibility, that value should be close to 1.
> >>> Setting that value to 0.5 is maybe more elegant, but less backward
> >>> compatible.
> >>
> >>
> >> So it's just a hardcoded "use a different calculation if the result
> >> would be greater than LIMIT" branch inside generate_canonical_centred. OK.
> >> But isn't that alternative path taken for a tiny fraction of the values
> >> from the URNG? Something like 1e-8 of the full range? So it's not
> >> symmetrical around zero? Instead it seems to return values in the range
> >> [-1e-08, 1.0).
> >>
> >> I thought the proposal was for a symmetric distribution around zero? So
> >> I think I'm failing to understand the proposal.
> >
> >
> >
> > Also, is that 'limit' parameter actually part of the proposal? Because
> > it wasn't shown in the original PDF you sent, but it seems to be necessary
> > to offer the backwards compatibility which is mentioned in the PDF and in
> > your original email.
> >
> >
> >>
> >>
> >> I'm not sure how very occasionally returning values below zero, i.e.
> >> below the expected range of [0,1), is better than very occasionally
> >> returning 1.0, i.e. above the expected range.
> >>
> >> So again, I think the idea of an alternative function that is
> >> symmetrical around zero is interesting. But I don't understand the
> >> backwards compatibility argument at all. It seems that your solution
> >> returns the same values as the old spec for std::generate_canonical in most
> >> cases, but just has a different form of failure in the problematic cases.
> >> Sure, we can adjust std::exponential_distribution to cope with the "bad"
> >> results, but std::generate_canonical_centred with a limit of 0.999999 is
> >> not centred, and fails to meet the original contract for
> >> std::generate_canonical.
> >>
> >> Unless I'm misunderstanding something, I think this proposal would make
> >> more sense if you dropped the backwards compatibility claims, and just
> >> demonstrated why a symmetric-around-zero distribution has valuable
> >> properties. But you should also show how it would be used by the rest of
> >> the library. You've shown how std::exponential_distribution can be adapted
> >> to work with (-0.5,0.5] but what about std::uniform_real_distribution and
> >> std::bernoulli_distribution, etc. I count 26 uses of generate_canonical in
> >> libstdc++'s <random>, should they all be changed to use
> >> generate_canonical_centred? Wouldn't that mean an extra branch and extra
> >> logic in every one of them, to handle the negative half of the range? Or
> >> maybe there's no benefit to some distributions and they should continue to
> >> use std::generate_canonical? I think the proposal needs to explain the
> >> extent of the changes that would result from introducing this new function.
> >
> >
> >
> > In other words, if the proposal is to allow
> > std::generate_canonical_centred to be customized by callers (but maybe
> > default to 0.5 so it's actually centred around zero?), how would that work
> > for all the existing callers inside the std::lib? If the idea is to allow
> > users to select an algorithm that's backwards compatible with the old spec
> > (so that re-running simulations produces the same results as with the old
> > spec), how would users select that when most uses of
> > std::generate_canonical are hidden inside std::xxx_distribution and are not
> > exposed to users directly?
> >
> > tl;dr who are the expected users of this new function, and how exactly
> > would it benefit them?
> >
> >
> >
> >>
> >>
> >> Or maybe I'm just misunderstanding something ... in which case that
> >> suggests something in the proposal needs to be clarified.
> >>
> >>
>

Received on 2025-12-05 13:37:55