Date: Mon, 8 Dec 2025 01:06:28 +0100
Trying to save the idea by reusing the non-P0952 post-processed generate_canonical with subnormal numbers in the case of -log(generate_canonical()):
Internally,
-log(generate_canonical()) is used.
Could the case of 0 be mapped to a number slightly below or above 1, so that statistical properties (such as the expected value) are preserved?
Numbers close to 1 cannot be represented with high accuracy, but the result
-log(1) = -0
can.
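As a minimal sketch of the two edge cases, assuming IEEE 754 doubles and a conforming std::log (the helper names are made up for illustration): an input of 1 gives negative zero, while an input of 0 gives +infinity, which is the case that needs remapping.

```cpp
#include <cmath>

// Hypothetical helper: the transform discussed above, applied to u in [0, 1].
inline double neg_log(double u) { return -std::log(u); }

// log(1) is +0, so negating it yields -0: equal to zero, with the sign bit set.
inline bool one_maps_to_negative_zero() {
    const double r = neg_log(1.0);
    return r == 0.0 && std::signbit(r);
}

// log(0) is -infinity, so -log(0) is +infinity.
inline bool zero_maps_to_infinity() {
    const double r = neg_log(0.0);
    return std::isinf(r) && r > 0.0;
}
```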
The overall problem with statistical properties is:
Taking a discrete distribution with a given expected value and variance and passing it through a non-linear function such as log yields an expected value and variance that differ slightly from those obtained with a continuous distribution.
That effect is present even when using the fixed-point random numbers of P0952.
If the discreteness is known, the distributions can apply correction factors.
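To put a number on that effect, here is a rough sketch under a simplifying assumption of mine: u is uniform on the grid {1/n, 2/n, ..., n/n}, which models a fixed-point generator after 0 has been moved to 1. For continuous u on (0, 1] the expected value of -log(u) is 1; on the grid it comes out slightly lower, by about log(2*pi*n) / (2*n) according to Stirling's formula. The function names are illustrative only.

```cpp
#include <cmath>

// Mean of -log(u) for u uniform on the grid {1/n, 2/n, ..., n/n}.
// Closed form: -(1/n) * sum_{k=1..n} log(k/n) = log(n) - log(n!) / n.
inline double discrete_mean_neg_log(double n) {
    return std::log(n) - std::lgamma(n + 1.0) / n;
}

// Leading-order shortfall relative to the continuous expected value 1,
// taken from Stirling's approximation of log(n!).
inline double stirling_deficit(double n) {
    const double two_pi = 6.283185307179586;
    return std::log(two_pi * n) / (2.0 * n);
}
```

A distribution that knows n could add a correction of this kind back; whether that is worth doing for n = 2^24, where the shortfall is below 1e-6, is a separate question.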
-----Original Message-----
From: Lénárd Szolnoki <cpp_at_[hidden]>
Sent: Mon 08.12.2025 01:00
Subject: Re: [std-proposals] solution proposal for Issue 2524: generate_canonical can occasionally return 1.0
To: std-proposals_at_[hidden];
CC: Sebastian Wittmeier <wittmeier_at_[hidden]>;
On 07/12/2025 14:50, Sebastian Wittmeier via Std-Proposals wrote:
> I meant something like
>
> val = generate_canonical();
>
> if (val==0) val=1;
>
> Or would the remaining subnormal numbers violate the lower bound of the interval as they
> sometimes are rounded to 0?
If I used a 32-bit IEEE float and built a uniform distribution in which every representable
value in the selected range was possible, then a uniform distribution on (0, 1] would roll
1 with probability 2^-24 or 2^-25, depending on the rounding strategy.
For a uniform distribution on [0, 1), 0 is rolled with probability 2^-149 or 2^-150,
depending on rounding strategy and assuming that subnormals are also rolled with uniform
probabilities and no representable value is skipped. The set of representable values is
much denser near 0 than near 1, hence the different probabilities.
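The exponents quoted above follow from the float spacing at the two ends of the interval. A small sketch, assuming float is IEEE binary32 (helper names are illustrative): the gap just below 1 is 2^-24 and the smallest positive subnormal is 2^-149; the probability of hitting the endpoint is that gap or half of it, depending on how real values are rounded onto the grid.

```cpp
#include <cmath>
#include <limits>

// Gap between 1.0f and the next smaller float: 2^-24 for IEEE binary32.
inline float gap_below_one() {
    return 1.0f - std::nextafter(1.0f, 0.0f);
}

// Smallest positive (subnormal) float, i.e. the gap just above 0:
// 2^-149 for IEEE binary32.
inline float smallest_positive() {
    return std::numeric_limits<float>::denorm_min();
}
```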
I don't think that changing such a distribution by adjusting 0 to 1 has the desired
effect; at least, the resulting distribution doesn't resemble a uniform distribution on (0,
1] that one would write from scratch.
Having said that, if I read it right, generate_canonical as specified in P0952 effectively
treats float as a fixed-point number with 24 bits of mantissa, and never accesses the extra
precision that is available on [0, 0.5), always skipping over subnormals and more. So
there adjusting 0 to 1 works as intended, as it always produces a distribution where each
possible value has the same probability. But 1-x works here as well, as there is no more
precision to lose.
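A sketch of that last point, using my own simplified model of the fixed-point reading (values k / 2^24 for k in [0, 2^24), not the actual P0952 algorithm): both "map 0 to 1" and "1 - x" turn the grid on [0, 1) into the same grid on (0, 1], and the subtraction is exact for every grid value. All names below are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical model of a 24-bit fixed-point canonical value: k / 2^24.
inline float fixed_point_value(std::uint32_t k) {
    return std::ldexp(static_cast<float>(k), -24);
}

// Option A from the thread: replace 0 by 1, giving values in (0, 1].
inline float zero_to_one(float x) { return x == 0.0f ? 1.0f : x; }

// Option B: reflect with 1 - x, also giving values in (0, 1].
inline float reflect(float x) { return 1.0f - x; }

// Checks that 1 - k/2^24 equals (2^24 - k)/2^24 exactly for every k,
// i.e. the reflection never rounds on this grid.
inline bool reflection_is_exact() {
    const std::uint32_t n = 1u << 24;
    for (std::uint32_t k = 0; k < n; ++k) {
        if (reflect(fixed_point_value(k)) != fixed_point_value(n - k)) {
            return false;
        }
    }
    return true;
}
```

The two options assign the values to different inputs, but since every grid point is equally likely, the resulting distributions coincide.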
Received on 2025-12-08 00:21:09
