Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0

From: Jonathan Wakely <cxx_at_[hidden]>
Date: Sun, 7 Dec 2025 10:22:33 +0000
On Sun, 7 Dec 2025, 08:49 Lénárd Szolnoki, <cpp_at_[hidden]> wrote:

>
>
> On 05/12/2025 18:25, Jonathan Wakely via Std-Proposals wrote:
> >
> >
> > On Fri, 5 Dec 2025 at 14:34, Jonathan Wakely <cxx_at_[hidden]> wrote:
> >
> >
> >
> > On Fri, 5 Dec 2025 at 14:06, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:
> >
> > "You keep saying "canonical_distribution" ... do you mean
> > std::generate_canonical, or one of the random number distributions in
> > <random>, or some non-standard random number distribution in your own
> > code?"
> >
> > I mean std::generate_canonical, yes.
> >
> > "But your proposal returns negative numbers for those 10 values. It's
> > highly debatable whether that is a "better distribution" given that
> > those values are outside the [0,1) range!
> > Your results might be more uniformly distributed over some range, but
> > it's a different range!"
> >
> > My suggestion to use "generate_canonical_centered" inside
> > "std::exponential_distribution" (as proposed in the sample file I
> > sent) does return 10 different values for the extremes. What libstdc++
> > is proposing is to return the same value for those 10 cases. As
> > explained before, the purpose here is to have that different range,
> > containing better precision, especially in the right limit, being
> > properly handled in the other distributions.
> >
> >
> >
> > Your proposal needs to say that then. Because currently it says:
> >
> > *0.4 3. Proposal*
> > Add the following to <random>:
> > namespace std {
> > template<class RealType = double, int bits, class URNG>
> > RealType generate_canonical_centered(URNG& g);
> > }
> >
> > Is that it? That's the whole proposal?!
> > Apparently not, apparently you want to change
> > std::exponential_distribution too. What about the other 20+ places
> > that use std::generate_canonical?
> >
> > So in summary:
> >
> > You should explain that, where P0952R2 says "In particular, code that
> > depends on a specific sequence of results from repeated invocations,
> > *or on a particular number of calls to the URBG argument*, will be
> > broken", it's the second part (in bold) that is a problem for you.
> > Based on your initial PDF proposal there is no clue whether the
> > compatibility you're talking about is the exact sequence of values
> > returned, or the number of invocations of the URBG. The word "discard"
> > doesn't even appear in the proposal.
> >
> > Your abstract says "without altering existing behavior". I think you
> > mean "without altering the C++23 behaviour", but you should be clear
> > about what you mean by "existing". P0952R2 is already part of the
> > C++26 draft. Assuming you mean "without changing the C++23 behaviour",
> > how does proposing a completely different function help? The P0952R2
> > changes would still be in C++26, and so that's still a change from
> > C++23. How does a different function with different behaviour undo
> > the changes to std::generate_canonical?!
> >
> > You need to be clear about what you're actually proposing, and the
> > impact on implementations (they would need to replace some or all
> > internal uses of std::generate_canonical with your new function, and
> > adjust to deal with a completely different output range?)
> >
> > Currently the proposal is vague and contradictory and confusing.
> >
> >
> > Finally, I don't see how making more use of the increased precision
> > near zero actually helps. The purpose of std::generate_canonical is to
> > produce values uniformly distributed in the range [0,1). Producing more
> > values close to zero because there is a higher density of representable
> > values there does not meet the contract.
>
> The way it helps is that the way the centered distribution is sliced and
> rearranged, it produces a uniform distribution on (0, 1], and then the
> produced value is used directly as -log(u).
>
> The way libstdc++ (and I assume other implementations as well) does it is
> that it produces a uniform distribution on [0, 1), and then uses it as
> -log(1-u). 1-u has reduced precision close to 0 (in fact on the whole
> range of (0, 0.5)). Using 1-u is effectively equivalent to generating a
> fixed-point number between 0 and 1 with 24 bits of mantissa in terms of
> precision (assuming float).
>
> If we deem the resulting exponential distribution acceptable then this
> algorithm is quite wasteful in how it uses the random generator, as it
> only uses a fixed 24 bits of entropy, but consumes a lot more bits from
> the generator to generate the intermediate generate_canonical.
>
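
The precision point in the quoted text can be checked with a small sketch. It assumes IEEE 754 binary32 float; the helper names (one_minus_u_on_24bit_grid, count_on_grid, largest_finite_sample) are made up for illustration and come from neither the proposal nor any library. It tests whether 1-u, for u drawn with std::generate_canonical<float, 24>, always lands on a multiple of 2^-24.

```cpp
#include <cmath>
#include <random>

// Returns true if 1 - u lies exactly on the grid of multiples of 2^-24.
// (Hypothetical helper; assumes IEEE 754 binary32 float.)
bool one_minus_u_on_24bit_grid(float u) {
    const float v = 1.0f - u;                // result rounded to float
    const float scaled = std::ldexp(v, 24);  // exact scaling by 2^24
    return scaled == std::floor(scaled);
}

// Counts how many of n samples from generate_canonical<float, 24> give a
// value of 1 - u that sits on that grid.
int count_on_grid(unsigned seed, int n) {
    std::mt19937 gen(seed);
    int count = 0;
    for (int i = 0; i < n; ++i) {
        const float u = std::generate_canonical<float, 24>(gen);
        if (one_minus_u_on_24bit_grid(u)) ++count;
    }
    return count;
}

// Largest finite value that -log(1 - u) can take when 1 - u is on the
// grid: the smallest nonzero grid point is 2^-24.
float largest_finite_sample() {
    return -std::log(std::ldexp(1.0f, -24));
}
```

Under these assumptions every sample should land on the grid, so any extra precision that u carries near zero is discarded by the subtraction; a very small u such as 1e-10f gives 1-u equal to exactly 1.0f.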


Which could be solved by changing the implementation of
exponential_distribution, right? Nothing says it has to make exactly one
call to generate_canonical<result_type>. For example, it could use at least
double (instead of float) for the call to generate_canonical. Or use
something completely different from generate_canonical, like the centred
function in the proposal.
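
As a rough illustration of the first option (drawing the uniform value in double even when the result type is float), here is a minimal sketch. The names sample_exponential_float and sample_mean are hypothetical; this is not how libstdc++ or any other implementation does it, and the use of std::log1p and the retry loop are choices made for this example only.

```cpp
#include <cmath>
#include <limits>
#include <random>

// Hypothetical helper, not a standard or libstdc++ function: draws an
// exponentially distributed float using a double-precision uniform value.
template <class URBG>
float sample_exponential_float(URBG& g, float lambda) {
    constexpr int digits = std::numeric_limits<double>::digits;
    double u;
    do {
        u = std::generate_canonical<double, digits>(g);
    } while (u >= 1.0);  // guard in case an implementation returns 1.0 (LWG 2524)
    // log1p(-u) evaluates log(1 - u) without first rounding 1 - u.
    const double x = -std::log1p(-u) / static_cast<double>(lambda);
    return static_cast<float>(x);  // round to float only at the very end
}

// Averages n samples; for an exponential distribution the mean approaches
// 1 / lambda as n grows.
double sample_mean(unsigned seed, float lambda, int n) {
    std::mt19937_64 gen(seed);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += sample_exponential_float(gen, lambda);
    return sum / n;
}
```

A caller would use it as `std::mt19937_64 gen(seed); float x = sample_exponential_float(gen, 2.0f);`. Whether the extra generator calls are an acceptable trade-off is a separate question.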

The proposal seems quite confused (or at least unclear) about what problem
it is trying to solve. Is it about exponential distribution (and *only*
exponential distribution?) being low quality when used with float? Or is it
about the reproducibility of values from the C++23 spec for
generate_canonical?

If the proposal is to make progress, I'd like to see a clear problem
statement, and a more detailed description of what is actually being
proposed and how it solves the problem.

e.g. "Using generate_canonical<float> for exponential_distribution<float>
is bad because ..." and/or "the P0952R2 changes to std::generate_canonical
are a problem because ..." and/or something else. Not just vague references
to backwards compatibility and brief mentions of subnormals which are
irrelevant to the P0952R2 changes.

A proposal should do more than show a single function declaration with no
description of its semantics and no discussion of what other changes
implementations might need to make.

Received on 2025-12-07 10:22:50