std-proposals

Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Fri, 5 Dec 2025 18:25:36 +0000
On Fri, 5 Dec 2025 at 14:34, Jonathan Wakely <cxx_at_[hidden]> wrote:

>
>
> On Fri, 5 Dec 2025 at 14:06, Juan Lucas Rey <juanlucasrey_at_[hidden]>
> wrote:
>
>> "You keep saying "canonical_distribution" ... do you mean
>> std::generate_canonical, or one of the random number distributions in
>> <random>, or some non-standard random number distribution in your own
>> code?"
>>
>> I mean std::generate_canonical, yes.
>>
>> "But your proposal returns negative numbers for those 10 values. It's
>> highly debatable whether that is a "better distribution" given that
>> those values are outside the [0,1) range!
>> Your results might be more uniformly distributed over some range, but
>> it's a different range!"
>>
>> My suggestion to use "generate_canonical_centered" inside
>> "std::exponential_distribution" (as proposed in the sample file I
>> sent) does return 10 different values for the extremes. What libstdc++
>> is proposing is to return the same value for those 10 cases. As
>> explained before, the purpose here is to have that different range,
>> containing better precision, especially in the right limit, being
>> properly handled in the other distributions.
>>
>
>
> Your proposal needs to say that then. Because currently it says:
>
> *0.4 3. Proposal*
> Add the following to <random>:
> namespace std {
> template<class RealType = double, int bits, class URNG>
> RealType generate_canonical_centered(URNG& g);
> }
>
> Is that it? That's the whole proposal?!
> Apparently not, apparently you want to change
> std::exponential_distribution too. What about the other 20+ places that use
> std::generate_canonical?
>
> So in summary:
>
> You should explain that where P0952R2 says "In particular, code that
> depends on a specific sequence of results from repeated invocations, *or
> on a particular number of calls to the URBG argument*, will be broken"
> that it's the second part (in bold) that is a problem for you. Based on
> your initial PDF proposal there is no clue whether the compatibility you're
> talking about is the exact sequence of values returned, or the number of
> invocations of the URBG. The word "discard" doesn't even appear in the
> proposal.
>
> Your abstract says "without altering existing behavior". I think you mean
> "without altering the C++23 behaviour", but you should be clear about what
> you mean by "existing". P0952R2 is already part of the C++26 draft.
> Assuming you mean "without changing the C++23 behaviour", how does
> proposing a completely different function help? The P0952R2 changes would
> still be in C++26, and so that's still a change from C++23. How does a
> different function with different behaviour undo the changes to
> std::generate_canonical?!
>
> You need to be clear about what you're actually proposing, and the impact
> on implementations (they would need to replace some or all internal uses of
> std::generate_canonical with your new function, and adjust to deal with a
> completely different output range?)
>
>
> Currently the proposal is vague and contradictory and confusing.
>

Finally, I don't see how making more use of the increased precision near
zero actually helps. The purpose of std::generate_canonical is to produce
values uniformly distributed in the range [0,1). Producing more values
close to zero because there is a higher density of representable values
there does not meet the contract.

If I have four buckets of different sizes, 5L, 3L, 2L, and 1L, and I have
to evenly distribute 4L of water into those buckets, putting more in the 5L
bucket because it has more capacity does not make sense. There should be
exactly 1L in each bucket. This seems analogous to saying that we should
return more results near zero, because there are more representable values
near zero.

Ideally, we want 25% of all results to be in the interval [0, 0.25) and 25%
of all results to be in the interval [0.75, 1.0). We don't want there to be
more than 25% of results in the first interval just because it's a bigger
bucket that can represent more distinct values, due to the higher precision.
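To make the 25% expectation concrete, here is a minimal sketch that counts how many std::generate_canonical results land in each quarter of [0,1). The helper name quarter_fractions, the choice of std::mt19937_64 and the sample count are illustrative assumptions, not part of any proposal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical helper (name is illustrative): fraction of
// std::generate_canonical results that fall in each quarter of [0,1).
inline std::array<double, 4> quarter_fractions(std::size_t samples, unsigned seed) {
    assert(samples > 0);
    std::mt19937_64 gen(seed);
    std::array<std::size_t, 4> counts{};
    for (std::size_t i = 0; i < samples; ++i) {
        const double x =
            std::generate_canonical<double, std::numeric_limits<double>::digits>(gen);
        std::size_t bucket = static_cast<std::size_t>(x * 4.0);
        if (bucket > 3) bucket = 3;  // guard in case an implementation returns 1.0
        ++counts[bucket];
    }
    std::array<double, 4> fractions{};
    for (std::size_t b = 0; b < 4; ++b)
        fractions[b] = static_cast<double>(counts[b]) / static_cast<double>(samples);
    return fractions;
}
```

Calling quarter_fractions(200000, 42u) should give four values that are each close to 0.25, even though [0, 0.25) contains far more representable doubles than [0.75, 1.0).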

And really finally finally, one of the P0952R2 authors reminded me that the
standard already has a note giving you the guarantee that you want:
https://eel.is/c++draft/rand.util.canonical#note-1
When the full range of the URBG (max() - min()) is 2^N - 1 for some N,
i.e. it can return 2^N distinct values, there is never a need to discard
any values from the URBG. So if you are only concerned with the additional
discards being done, just make sure your URBG is sensible. If your URBG
returns any value in the range [0,UINT_MAX] or [0,ULLONG_MAX] then there
will be no discarded values.
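
As a rough illustration of that condition, the following sketch checks at compile time whether a generator type can return a power-of-two number of distinct values. The helper has_power_of_two_range is a hypothetical name, not a standard facility, and it assumes the generator's min() and max() are constexpr static members, as they are for the standard engines.

```cpp
#include <random>
#include <type_traits>

// Hypothetical helper (not in the standard library): true when the number
// of distinct values G can return, max() - min() + 1, is a power of two.
template <class G>
constexpr bool has_power_of_two_range() {
    using U = std::make_unsigned_t<typename G::result_type>;
    const U span = static_cast<U>(G::max()) - static_cast<U>(G::min());
    // span + 1 is a power of two exactly when span has the form 0b0...01...1.
    // Unsigned wraparound makes this work for a full-width range as well.
    return (span & static_cast<U>(span + 1)) == 0;
}

static_assert(has_power_of_two_range<std::mt19937>(),
              "mt19937 covers [0, 2^32 - 1]");
static_assert(has_power_of_two_range<std::mt19937_64>(),
              "mt19937_64 covers [0, 2^64 - 1]");
static_assert(!has_power_of_two_range<std::minstd_rand>(),
              "minstd_rand covers [1, 2^31 - 2], which is not a power-of-two range");
```

Per the note linked above, generators that pass this check are the ones for which no values need to be discarded. Engines such as std::minstd_rand fail it.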

Received on 2025-12-05 18:25:54