ISOCPP std-proposals List: Re: [std-proposals] solution proposal for Issue 2524: generate

From: Sebastian Wittmeier <wittmeier_at_[hidden]>
Date: Sun, 7 Dec 2025 15:50:25 +0100

I meant something like

    val = generate_canonical(); if (val==0) val=1;

Or would the remaining subnormal numbers violate the lower bound of the
interval, as they are sometimes rounded to 0?

-----Original Message-----
From: Lénárd Szolnoki <cpp_at_[hidden]>
Sent: Sun 07.12.2025 12:56
Subject: Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0
To: std-proposals_at_[hidden];
CC: Sebastian Wittmeier <wittmeier_at_[hidden]>;

On 07/12/2025 10:57, Sebastian Wittmeier via Std-Proposals wrote:
> Changing from [0; 1) to (0; 1] and vice versa is simple on the call site,
> just one conditional. So the exponential distribution could fix it without
> a new generate_canonical?

Can you elaborate what the simple fix is from changing [0, 1) to (0, 1]?
Apart from 1-x, which has the precision problem.

> -----Original Message-----
> *From:* Lénárd Szolnoki via Std-Proposals <std-proposals_at_[hidden]>
> *Sent:* Sun 07.12.2025 09:49
> *Subject:* Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0
> *To:* std-proposals_at_[hidden]; Jonathan Wakely <cxx_at_[hidden]>;
> *CC:* Lénárd Szolnoki <cpp_at_[hidden]>; pnash44_at_[hidden]; Juan Lucas Rey <juanlucasrey_at_[hidden]>;
>
> On 05/12/2025 18:25, Jonathan Wakely via Std-Proposals wrote:
> >
> > On Fri, 5 Dec 2025 at 14:34, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >
> > On Fri, 5 Dec 2025 at 14:06, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:
> >
> > "You keep saying "canonical_distribution" ... do you mean
> > std::generate_canonical, or one of the random number distributions in
> > <random>, or some non-standard random number distribution in your own
> > code?"
> >
> > I mean std::generate_canonical, yes.
> >
> > "But your proposal returns negative numbers for those 10 values. It's
> > highly debatable whether that is a "better distribution" given that
> > those values are outside the [0,1) range!
> > Your results might be more uniformly distributed over some range, but
> > it's a different range!"
> >
> > My suggestion to use "generate_canonical_centered" inside
> > "std::exponential_distribution" (as proposed in the sample file I
> > sent) does return 10 different values for the extremes. What libstdc++
> > is proposing is to return the same value for those 10 cases. As
> > explained before, the purpose here is to have that different range,
> > containing better precision, especially in the right limit, being
> > properly handled in the other distributions.
> >
> > Your proposal needs to say that then. Because currently it says:
> >
> > *0.4 3. Proposal*
> > Add the following to <random>:
> > namespace std {
> > template<class RealType = double, int bits, class URNG>
> > RealType generate_canonical_centered(URNG& g);
> > }
> >
> > Is that it? That's the whole proposal?!
> > Apparently not, apparently you want to change
> > std::exponential_distribution too. What about the other 20+ places that
> > use std::generate_canonical?
> >
> > So in summary:
> >
> > You should explain that where P0952R2 says "In particular, code that
> > depends on a specific sequence of results from repeated invocations, *or
> > on a particular number of calls to the URBG argument*, will be broken"
> > that it's the second part (in bold) that is a problem for you. Based on
> > your initial PDF proposal there is no clue whether the compatibility
> > you're talking about is the exact sequence of values returned, or the
> > number of invocations of the URBG. The word "discard" doesn't even appear
> > in the proposal.
> >
> > Your abstract says "without altering existing behavior". I think you mean
> > "without altering the C++23 behaviour", but you should be clear about
> > what you mean by "existing". P0952R2 is already part of the C++26 draft.
> > Assuming you mean "without changing the C++23 behaviour", how does
> > proposing a completely different function help? The P0952R2 changes would
> > still be in C++26, and so that's still a change from C++23. How does a
> > different function with different behaviour undo the changes to
> > std::generate_canonical?!
> >
> > You need to be clear about what you're actually proposing, and the impact
> > on implementations (they would need to replace some or all internal uses
> > of std::generate_canonical with your new function, and adjust to deal
> > with a completely different output range?)
> >
> > Currently the proposal is vague and contradictory and confusing.
> >
> > Finally, I don't see how making more use of the increased precision near
> > zero actually helps. The purpose of std::generate_canonical is to produce
> > values uniformly distributed in the range [0,1). Producing more values
> > close to zero because there is a higher density of representable values
> > there does not meet the contract.
>
> It helps because of the way the centered distribution is sliced and
> rearranged: it produces a uniform distribution on (0, 1], and the produced
> value u is then used directly as -log(u).
>
> libstdc++ (and I assume other implementations as well) produces a uniform
> distribution on [0, 1) and then uses it as -log(1-u). 1-u has reduced
> precision close to 0 (in fact on the whole range of (0, 0.5)). Using 1-u is
> effectively equivalent to generating a fixed-point number between 0 and 1
> with 24 bits of mantissa in terms of precision (assuming float).
>
> If we deem the resulting exponential distribution acceptable, then this
> algorithm is quite wasteful in how it uses the random generator: it only
> uses a fixed 24 bits of entropy, but consumes a lot more bits from the
> generator to produce the intermediate generate_canonical result.
>
> > If I have four buckets of different sizes, 5L, 3L, 2L, and 1L, and I have
> > to evenly distribute 4L of water into those buckets, putting more in the
> > 5L bucket because it has more capacity does not make sense. There should
> > be exactly 1L in each bucket. This seems analogous to saying that we
> > should return more results near zero, because there are more
> > representable values near zero.
> >
> > Ideally, we want 25% of all results to be in the interval [0, 0.25) and
> > 25% of all results to be in the interval [0.75, 1.0). We don't want there
> > to be more than 25% of results in the first interval just because it's a
> > bigger bucket that can represent more distinct values, due to the higher
> > precision.
> >
> > And really finally finally, one of the P0952R2 authors reminded me that
> > the standard already has a note giving you the guarantee that you want:
> > https://eel.is/c++draft/rand.util.canonical#note-1
> > When the full range of the URBG is (2^N - 1) for any N, there is never a
> > need to discard any values from the URBG. So if you are only concerned
> > with the additional discards being done, just make sure your URBG is
> > sensible. If your URBG returns any value in the range [0,UINT_MAX) or
> > [0,ULLONG_MAX) then there will be no discarded values.
> >
> > --
> > Std-Proposals mailing list
> > Std-Proposals_at_[hidden]
> > https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
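
For reference, here is a minimal sketch of the two call-site remappings from [0, 1) to (0, 1] that come up in this thread: the single conditional from the top of this message, and the 1-x form that was set aside for its precision problem. The names canonical_open_closed, exponential_from_open_closed and complement are hypothetical and not part of <random>; the choice of numeric_limits<RealType>::digits for the bits argument is also an assumption. It does not address the subnormal question asked above, and it is not what any standard library implementation does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical helper, not part of <random>: draw from [0, 1) and remap an
// exact 0 to 1, so the returned value lies in (0, 1].
template <class RealType = double, class URBG>
RealType canonical_open_closed(URBG& g) {
    constexpr std::size_t bits = std::numeric_limits<RealType>::digits;
    RealType val = std::generate_canonical<RealType, bits>(g);
    if (val == RealType(0)) {
        val = RealType(1);
    }
    return val;
}

// Hypothetical helper: exponential variate computed as -log(u) / lambda with
// u in (0, 1], so log never receives 0 and no 1 - u subtraction is involved.
template <class RealType = double, class URBG>
RealType exponential_from_open_closed(URBG& g, RealType lambda) {
    assert(lambda > RealType(0));
    return -std::log(canonical_open_closed<RealType>(g)) / lambda;
}

// The other remapping, 1 - u, shown only for comparison: every float input
// small enough to round away in the subtraction collapses to exactly 1.
inline float complement(float u) {
    return 1.0f - u;
}
```

With float, complement(std::ldexp(1.0f, -30)) rounds to exactly 1.0f, and the smallest nonzero result complement can produce for an input below 1 is 2^-24, which is the granularity described in the quoted message for -log(1-u).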

Received on 2025-12-07 15:05:03