std-proposals

Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Sebastian Wittmeier <wittmeier_at_[hidden]>
Date: Sun, 7 Dec 2025 15:50:25 +0100
I meant something like

    val = generate_canonical(); if (val == 0) val = 1;

Or would the remaining subnormal numbers violate the lower bound of the interval, since they are sometimes rounded to 0?

-----Original Message-----
From: Lénárd Szolnoki <cpp_at_[hidden]>
Sent: Sun 07.12.2025 12:56
Subject: Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0
To: std-proposals_at_[hidden]
CC: Sebastian Wittmeier <wittmeier_at_[hidden]>

On 07/12/2025 10:57, Sebastian Wittmeier via Std-Proposals wrote:
> Changing from [0; 1) to (0; 1] and vice versa is simple on the call site, just one
> conditional. So the exponential distribution could fix it without a new generate_canonical?

Can you elaborate what the simple fix is from changing [0, 1) to (0, 1]? Apart from 1-x,
which has the precision problem.

>     -----Original Message-----
>     From: Lénárd Szolnoki via Std-Proposals <std-proposals_at_[hidden]>
>     Sent: Sun 07.12.2025 09:49
>     Subject: Re: [std-proposals] solution proposal for Issue 2524: generate_canonical
>     can occasionally return 1.0
>     To: std-proposals_at_[hidden]; Jonathan Wakely <cxx_at_[hidden]>
>     CC: Lénárd Szolnoki <cpp_at_[hidden]>; pnash44_at_[hidden]; Juan Lucas Rey
>     <juanlucasrey_at_[hidden]>
>
>     On 05/12/2025 18:25, Jonathan Wakely via Std-Proposals wrote:
>      >
>      > On Fri, 5 Dec 2025 at 14:34, Jonathan Wakely <cxx_at_[hidden]> wrote:
>      >
>      >     On Fri, 5 Dec 2025 at 14:06, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:
>      >
>      >         "You keep saying "canonical_distribution" ... do you mean
>      >         std::generate_canonical, or one of the random number distributions in
>      >         <random>, or some non-standard random number distribution in your own
>      >         code?"
>      >
>      >         I mean std::generate_canonical, yes.
>      >
>      >         "But your proposal returns negative numbers for those 10 values. It's
>      >         highly debatable whether that is a "better distribution" given that
>      >         those values are outside the [0,1) range!
>      >         Your results might be more uniformly distributed over some range, but
>      >         it's a different range!"
>      >
>      >         My suggestion to use "generate_canonical_centered" inside
>      >         "std::exponential_distribution" (as proposed in the sample file I
>      >         sent) does return 10 different values for the extremes. What libstdc++
>      >         is proposing is to return the same value for those 10 cases. As
>      >         explained before, the purpose here is to have that different range,
>      >         containing better precision, especially in the right limit, being
>      >         properly handled in the other distributions.
>      >
>      >     Your proposal needs to say that then. Because currently it says:
>      >
>      >     *0.4 3. Proposal*
>      >     Add the following to <random>:
>      >     namespace std {
>      >     template<class RealType = double, int bits, class URNG>
>      >     RealType generate_canonical_centered(URNG& g);
>      >     }
>      >
>      >     Is that it? That's the whole proposal?!
>      >     Apparently not, apparently you want to change std::exponential_distribution
>      >     too. What about the other 20+ places that use std::generate_canonical?
>      >
>      >     So in summary:
>      >
>      >     You should explain that where P0952R2 says "In particular, code that depends on a
>      >     specific sequence of results from repeated invocations, *or on a particular
>      >     number of calls to the URBG argument*, will be broken" that it's the second part
>      >     (in bold) that is a problem for you.
>      >     Based on your initial PDF proposal there is no clue whether the
>      >     compatibility you're talking about is the exact sequence of values returned, or the
>      >     number of invocations of the URBG. The word "discard" doesn't even appear in
>      >     the proposal.
>      >
>      >     Your abstract says "without altering existing behavior". I think you mean "without
>      >     altering the C++23 behaviour", but you should be clear about what you mean by
>      >     "existing". P0952R2 is already part of the C++26 draft. Assuming you mean "without
>      >     changing the C++23 behaviour", how does proposing a completely different function
>      >     help? The P0952R2 changes would still be in C++26, and so that's still a change
>      >     from C++23. How does a different function with different behaviour undo the
>      >     changes to std::generate_canonical?!
>      >
>      >     You need to be clear about what you're actually proposing, and the impact on
>      >     implementations (they would need to replace some or all internal uses of
>      >     std::generate_canonical with your new function, and adjust to deal with a
>      >     completely different output range?)
>      >
>      >     Currently the proposal is vague and contradictory and confusing.
>      >
>      > Finally, I don't see how making more use of the increased precision near zero actually
>      > helps. The purpose of std::generate_canonical is to produce values uniformly
>      > distributed in the range [0,1). Producing more values close to zero because there is
>      > a higher density of representable values there does not meet the contract.
>
>     It helps because, the way the centered distribution is sliced and rearranged, it
>     produces a uniform distribution on (0, 1], and then the produced value is used
>     directly as -log(u).
>
>     The way libstdc++ (and I assume other implementations as well) does it is to produce
>     a uniform distribution on [0, 1) and then use it as -log(1-u). 1-u has reduced precision
>     close to 0 (in fact on the whole range of (0, 0.5)). Using 1-u is effectively equivalent
>     to generating a fixed-point number between 0 and 1 with 24 bits of mantissa in terms of
>     precision (assuming float).
>
>     If we deem the resulting exponential distribution acceptable then this algorithm is quite
>     wasteful in how it uses the random generator, as it only uses a fixed 24 bits of entropy,
>     but consumes a lot more bits from the generator to produce the intermediate
>     generate_canonical result.
>
>      > If I have four buckets of different sizes, 5L, 3L, 2L, and 1L, and I have to evenly
>      > distribute 4L of water into those buckets, putting more in the 5L bucket because it
>      > has more capacity does not make sense. There should be exactly 1L in each bucket.
>      > This seems analogous to saying that we should return more results near zero, because
>      > there are more representable values near zero.
>      >
>      > Ideally, we want 25% of all results to be in the interval [0, 0.25) and 25% of all
>      > results to be in the interval [0.75, 1.0). We don't want there to be more than 25% of
>      > results in the first interval just because it's a bigger bucket that can represent
>      > more distinct values, due to the higher precision.
>      >
>      > And really finally finally, one of the P0952R2 authors reminded me that the standard
>      > already has a note giving you the guarantee that you want:
>      > https://eel.is/c++draft/rand.util.canonical#note-1
>      > When the full range of the URBG is (2^N - 1) for any N, there is never a need to
>      > discard any values from the URBG.
>      > So if you are only concerned with the additional discards being
>      > done, just make sure your URBG is sensible. If your URBG returns any value in the
>      > range [0,UINT_MAX] or [0,ULLONG_MAX] then there will be no discarded values.
>      >
>      > --
>      > Std-Proposals mailing list
>      > Std-Proposals_at_[hidden]
>      > https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
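
For readers following the thread, here is a minimal sketch of the call-site remap suggested at the top of this message, next to a check of the 1-u precision loss described in the quoted text. The names remap_zero_to_one, exponential_via_remap and one_minus_u_rounds_to_one are hypothetical helpers made up for illustration; they are not part of <random> or of any proposal in this thread. The sketch assumes IEEE 754 float arithmetic and a generate_canonical that returns values in [0, 1).

```cpp
#include <cmath>
#include <limits>
#include <random>

// Hypothetical helper: maps [0, 1) to (0, 1] by moving an exact 0 to 1.
// Every other value is returned unchanged, so no precision is lost near 0.
template <class RealType>
RealType remap_zero_to_one(RealType u) {
    return u == RealType(0) ? RealType(1) : u;
}

// Hypothetical helper: an exponential variate computed as -log(u) / lambda
// with u in (0, 1], instead of the -log(1 - u) form discussed in the thread.
template <class RealType, class URBG>
RealType exponential_via_remap(URBG& g, RealType lambda) {
    RealType u = std::generate_canonical<
        RealType, std::numeric_limits<RealType>::digits>(g);
    return -std::log(remap_zero_to_one(u)) / lambda;
}

// Illustrates the precision problem of 1 - u: for a small enough u the
// subtraction rounds to exactly 1, so -log(1 - u) would collapse to 0.
inline bool one_minus_u_rounds_to_one(float u) {
    float one_minus_u = 1.0f - u;
    return one_minus_u == 1.0f;
}
```

With this remap, an input of exactly 0 yields -log(1) = 0 rather than infinity, and small nonzero inputs keep their full float precision. Whether moving the probability mass of 0 onto 1 is acceptable for a given distribution is the open question in this thread, not something the sketch settles.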

Received on 2025-12-07 15:05:03