Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Lénárd Szolnoki <cpp_at_[hidden]>
Date: Mon, 8 Dec 2025 00:34:26 +0000
On 05/12/2025 18:25, Jonathan Wakely via Std-Proposals wrote:
>
>
> On Fri, 5 Dec 2025 at 14:34, Jonathan Wakely <cxx_at_[hidden]> wrote:
>
>
>
> On Fri, 5 Dec 2025 at 14:06, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:
>
> "You keep saying "canonical_distribution" ... do you mean
> std::generate_canonical, or one of the random number distributions in
> <random>, or some non-standard random number distribution in your own
> code?"
>
> I mean std::generate_canonical, yes.
>
> "But your proposal returns negative numbers for those 10 values. It's
> highly debatable whether that is a "better distribution" given that
> those values are outside the [0,1) range!
> Your results might be more uniformly distributed over some range, but
> it's a different range!"
>
> My suggestion to use "generate_canonical_centered" inside
> "std::exponential_distribution" (as proposed in the sample file I
> sent) does return 10 different values for the extremes. What libstdc++
> is proposing is to return the same value for those 10 cases. As
> explained before, the purpose here is to have that different range,
> containing better precision, especially in the right limit, being
> properly handled in the other distributions.
>
>
>
> Your proposal needs to say that then. Because currently it says:
>
> *0.4 3. Proposal*
> Add the following to <random>:
> namespace std {
> template<class RealType = double, int bits, class URNG>
> RealType generate_canonical_centered(URNG& g);
> }
>
> Is that it? That's the whole proposal?!
> Apparently not, apparently you want to change std::exponential_distribution too. What
> about the other 20+ places that use std::generate_canonical?
>
> So in summary:
>
> You should explain that where P0952R2 says "In particular, code that depends on a
> specific sequence of results from repeated invocations, *or on a particular number of
> calls to the URBG argument*, will be broken" that it's the second part (in bold) that
> is a problem for you. Based on your initial PDF proposal there is no clue whether the
> compatibility you're talking about is the exact sequence of values returned, or the
> number of invocations of the URBG. The word "discard" doesn't even appear in the proposal.
>
> Your abstract says "without altering existing behavior". I think you mean "without
> altering the C++23 behaviour", but you should be clear about what you mean by
> "existing". P0952R2 is already part of the C++26 draft. Assuming you mean "without
> changing the C++23 behaviour", how does proposing a completely different function
> help? The P0952R2 changes would still be in C++26, and so that's still a change from
> C++23. How does a different function with different behaviour undo the changes to
> std::generate_canonical?!
>
> You need to be clear about what you're actually proposing, and the impact on
> implementations (they would need to replace some or all internal uses of
> std::generate_canonical with your new function, and adjust to deal with a completely
> different output range?)
>
> Currently the proposal is vague and contradictory and confusing.
>
>
> Finally, I don't see how making more use of the increased precision near zero actually
> helps. The purpose of std::generate_canonical is to produce values uniformly distributed
> in the range [0,1). Producing more values close to zero because there is a higher density
> of representable values there does not meet the contract.
>
> If I have four buckets of different sizes, 5L, 3L, 2L, and 1L, and I have to evenly
> distribute 4L of water into those buckets, putting more in the 5L bucket because it has
> more capacity does not make sense. There should be exactly 1L in each bucket. This seems
> analogous to saying that we should return more results near zero, because there are more
> representable values near zero.
>
> Ideally, we want 25% of all results to be in the interval [0, 0.25) and 25% of all results
> to be in the interval [0.75, 1.0). We don't want there to be more than 25% of results in
> the first interval just because it's a bigger bucket that can represent more distinct
> values, due to the higher precision.
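
The 25%-per-quarter expectation in the quoted text can be checked empirically. The following is only a sketch: quarter_counts is a hypothetical helper, and the engine (std::mt19937_64) and the seed are arbitrary choices. The clamp covers the 1.0 return value that Issue 2524 is about.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: counts how many of n samples fall into each quarter
// of [0, 1): [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1).
inline std::array<std::size_t, 4> quarter_counts(std::size_t n, std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    std::array<std::size_t, 4> counts{};
    for (std::size_t i = 0; i < n; ++i) {
        const double x =
            std::generate_canonical<double, std::numeric_limits<double>::digits>(gen);
        std::size_t bucket = static_cast<std::size_t>(x * 4.0);
        if (bucket > 3) bucket = 3;  // guard: an affected implementation may return 1.0
        ++counts[bucket];
    }
    return counts;
}
```

With a large n, each of the four counts should come out close to n / 4, regardless of how many distinct representable values each quarter contains.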

Using all the representable values does not mean that each representable value must have
the same probability.

For whatever reason, it was decided that generate_canonical should limit the precision to
std::numeric_limits<RealType>::digits. If it didn't, and results were simply rounded down
where necessary (which doesn't necessarily require floating-point environment rounding
modes), then it could produce all representable values in [0.25, 0.5) as well as in [0.5,
1.0), but each specific value in the former range would have half the probability of a value
in the latter, so your bucketing reasoning wouldn't break.
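
The rounding-down idea can be sketched with integer arithmetic only. This is an illustration, not what generate_canonical is specified to do: round_down_to_unit_interval is a hypothetical name, the input is assumed to be 64 uniformly random bits, and double is assumed to be IEEE 754 binary64 (53 significant bits).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps 64 random bits u to the largest double that is
// <= u / 2^64. The truncation is done on integers, so the floating-point
// rounding mode is never touched. Assumes a 53-bit significand.
inline double round_down_to_unit_interval(std::uint64_t u) {
    if (u == 0) return 0.0;
    int shift = 0;
    while ((u & (std::uint64_t{1} << 63)) == 0) {  // normalise so the top bit is set
        u <<= 1;
        ++shift;
    }
    const std::uint64_t mantissa = u >> 11;  // keep the top 53 bits, drop the rest
    // mantissa is in [2^52, 2^53), so the conversion and the scaling are exact.
    return std::ldexp(static_cast<double>(mantissa), -(53 + shift));
}
```

Under these assumptions, every double in [0.5, 1.0) is hit by 2^11 distinct inputs and every double in [0.25, 0.5) by 2^10 distinct inputs. The second range has twice as many representable values, each with half the probability, so both ranges still receive probability mass equal to their width.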

>
> And really finally finally, one of the P0952R2 authors reminded me that the standard
> already has a note giving you the guarantee that you want:
> https://eel.is/c++draft/rand.util.canonical#note-1
> When the full range of the URBG is (2^N - 1) for any N, there is never a need to discard
> any values from the URBG. So if you are only concerned with the additional discards being
> done, just make sure your URBG is sensible. If your URBG returns any value in the range
> [0,UINT_MAX] or [0,ULLONG_MAX] then there will be no discarded values.
>
>
>
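
The condition described in the quoted note can be checked at compile time. A minimal sketch, assuming the draft wording in which R = g.max() - g.min() + 1 and no attempt is ever rejected when R is a power of the radix (2 for the usual floating-point types); range_is_power_of_two is a hypothetical helper, not a standard facility.

```cpp
#include <random>

// Hypothetical helper: true when max() - min() has the form 2^N - 1, i.e. the
// number of distinct outputs R = max() - min() + 1 is a power of two.
// Working with max() - min() avoids overflow when R itself is 2^64.
template <class URBG>
constexpr bool range_is_power_of_two() {
    using U = typename URBG::result_type;
    constexpr U span = URBG::max() - URBG::min();
    // An all-ones bit pattern shares no set bit with its successor
    // (the successor wraps to 0 when span is the maximum value of U).
    return (span & static_cast<U>(span + 1)) == 0;
}

static_assert(range_is_power_of_two<std::mt19937>(), "R == 2^32");
static_assert(range_is_power_of_two<std::mt19937_64>(), "R == 2^64");
static_assert(!range_is_power_of_two<std::minstd_rand>(), "R == 2^31 - 2");
```

Under that reading, the Mersenne Twister engines never need a discard, while std::minstd_rand (range [1, 2147483646]) can.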
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals

Received on 2025-12-08 00:34:40