ISOCPP std-proposals List: Re: [std-proposals] solution proposal for Issue 2524: generate

From: Juan Lucas Rey <juanlucasrey_at_[hidden]>
Date: Fri, 5 Dec 2025 13:51:34 +0000

Hi,

"What do you mean by "discard RNGs"? Deterministically discard values
produced from RNGs?"

Yes. Assume I want to simulate 10.000 Monte Carlo paths in one process
and the next 10.000 Monte Carlo paths in another process. Assuming I
know that each path uses Y RNGs, then I would have to call
rng.discard(10000 * Y) on the second process before using it.
With p0952r2 now I need to actually produce the numbers with
"canonical_distribution", and if any of those gives 1.0 I have to add
1 to 10000 *Y. This defeats the purpose of having fast discard
methods.

"That could be achieved the same way libstdc++ does it: when
generate_canonical would return 1.0, just return
std::nextafter(RealType(1), RealType(0)) instead. We don't need a
completely different function that doesn't return values in the range
[0,1) and which needs every caller to be adjusted for the new output
range."

My proposal actually allows for more control. say for example all 10
values [RNG::max() -10, RNG::max()] return 1.0 with
canonical_distribution. libstd++ proposes to return
std::nextafter(RealType(1), RealType(0)) for these 10 values. What I
am proposing is to return different values for those 10 results,
allowing for better distribution and avoiding the loss of information
from multiple distinct RNG outputs being mapped to the same
floating-point value.

>From what you tell me then though, it is not worth pursuing this
further because of timing?

Thanks
Lucas

On Fri, 5 Dec 2025 at 13:37, Jonathan Wakely <cxx_at_[hidden]> wrote:
>
>
>
> On Fri, 5 Dec 2025 at 13:02, Juan Lucas Rey <juanlucasrey_at_[hidden]> wrote:
>>
>> Hi,
>>
>> The main motivation here is to be able to remove the issue LWG 2524
>> without creating additional issues.
>> In particular avoiding the issue mentioned in p0952r2:
>> "code that depends on a specific sequence of results from repeated
>> invocations, or on a particular number of calls to the URBG argument,
>> will be broken."
>
>
> OK, but it's too late for that. LWG 2524 has been resolved by accepting P0952 into C++26. If you want to undo that, the time for that was about 3 months ago when the C++26 draft was sent out for national body ballot (which your employer took part in).
>
>
>>
>>
>> why? I work daily with Monte Carlo simulations and if I want to
>> parallelise simulations, I have to be able to discard RNGs properly. A
>
>
> What do you mean by "discard RNGs"? Deterministically discard values produced from RNGs?
>
>>
>> lot of work has been dedicated to make those discards fast, but with
>> this proposals I would now have to actually produce the numbers
>> produced by the canonical_distribution and check them one by one to
>> make sure there has not been any additional discard.
>
>
> This motivation should be in the proposal.
>
> But why do you need to avoid additional discards? So that using one random engine across multiple threads is deterministic, with each thread invoking the random engine exactly N times? Or something else?
>
>
>>
>>
>> What I suggest is up to discussion.
>
>
> The proposal should say that. It's currently unclear.
>
>
>>
>> -maybe this centered_canonical_distribution is to be used exclusively
>> inside other distributions like std::exponential_distribution
>> exclusively to avoid LWG 2524, while not breaking the specific
>> sequence
>
>
> That could be achieved the same way libstdc++ does it: when generate_canonical would return 1.0, just return std::nextafter(RealType(1), RealType(0)) instead. We don't need a completely different function that doesn't return values in the range [0,1) and which needs every caller to be adjusted for the new output range.
>
>
>>
>> -maybe we want to expose this "centered_canonical_distribution " as a
>> generalisation of canonical_distribution (canonical_distribution would
>> be a specialisation of centered_canonical_distribution setting the
>> limit parameter to 1)
>> -the reason why I introduced this limit parameter today is to show how
>> we can use this approach to either prioritise backward compatibility
>> or a more elegant symmetric distribution. That's up for discussion and
>> I felt like giving that additional flexibility would make this a more
>> realistic proposal.
>
>
> Then the proposal should say that. You showed a function template declaration in the proposal and said that's what you're proposing. If you're actually proposing "maybe this, or maybe something similar with other properties" then you need to explain that in the proposal.
>
>
>>
>> -the symmetry around 0 does have some nice properties, but I think
>> that is of second order compared to fixing LWG 2524 and avoiding
>> issues created by p0952r2
>
>
> Then the proposal needs to be clear that "the issues created by P0952R2" are what you're trying to solve.
>
>>
>> - bear in mind that relatively recent trend of using quantisation in
>> TPUs make this more relevant: using floats with less precision means
>> that hitting a value of 1 using canonical_distribution becomes more
>> and more likely. being able to benefit from subnormal numbers on both
>> sides of 0 will definitely make this issue go away, and would allow to
>> use 32 or 64 or even 128 bit unsigned int generators to later generate
>> float16, for example, safely.
>>
>> I am new to this process so if there is any meeting or forum to
>> discuss this ( besides these emails) I would be more than happy to be
>> a part of it.
>
>
> The discussions all happened in 2023 and could have been re-opened any time up to about 3 months ago :-(
>
>>
>>
>> Lucas
>>
>> On Fri, 5 Dec 2025 at 12:32, Jonathan Wakely <cxx_at_[hidden]> wrote:
>> >
>> >
>> >
>> > On Fri, 5 Dec 2025 at 12:17, Jonathan Wakely <cxx_at_[hidden]> wrote:
>> >>
>> >> On Fri, 5 Dec 2025 at 11:03, Juan Lucas Rey wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> Here is an example code that shows:
>> >>>
>> >>> -the original issue reproduced as shown in
>> >>> https://cplusplus.github.io/LWG/issue2524
>> >>
>> >>
>> >> For anybody else trying it, the original problem can be reproduced with libc++ but not libstdc++, because libstdc++ has a workaround in std::generate_canonical which returns the largest value of RealType which is less than RealType(1).
>> >>
>> >>>
>> >>>
>> >>> - std2::exponential_distribution, using internally
>> >>> std2::generate_canonical_centered and NOT showing the issue
>> >>> -other values are the same
>> >>>
>> >>> I have added a double template parameter "limit" to
>> >>> "std2::generate_canonical_centered". to allow maximum backwards
>> >>> compatibility, that value should be close to 1.
>> >>> setting that value to 0.5 is maybe more elegant, but less backward compatible.
>> >>
>> >>
>> >> So it's just a hardcoded "use a different calculation if the result would be greater than LIMIT" branch inside generate_canonical_centred. OK. But isn't that alternative path taken for a tiny fraction of the values from the URNG? Something like 1e-8 of the full range? So it's not symmetrical around zero? Instead it seems to return values in the range [-1e-08, 1.0).
>> >>
>> >> I thought the proposal was for a symmetric distribution around zero? So I think I'm failing to understand the proposal.
>> >
>> >
>> >
>> > Also, is that 'limit' parameter actually part of the proposal? Because it wasn't shown in the original PDF you sent, but it seems to be necessary to offer the backwards compatibility which is mentioned in the PDF and in your original email.
>> >
>> >
>> >>
>> >>
>> >> I'm not sure how very occasionally returning values below zero, i.e. below the expected range of [0,1), is better than very occasionally returning 1.0, i.e. above the expected range.
>> >>
>> >> So again, I think the idea of an alternative function that is symmetrical around zero is interesting. But I don't understand the backwards compatibility argument at all. It seems that your solution returns the same values as the old spec for std::generate_canonical in most cases, but just has a different form of failure in the problematic cases. Sure, we can adjust std::exponential_distribution to cope with the "bad" results, but std::generate_canonical_centred with a limit of 0.999999 is not centred, and fails to meet the original contract for std::generate_canonical.
>> >>
>> >> Unless I'm misunderstanding something, I think this proposal would make more sense if you dropped the backwards compatibility claims, and just demonstrated why a symmetric-around-zero distribution has valuable properties. But you should also show how it would be used by the rest of the library. You've shown how std::exponential_distribution can be adapted to work with (-0.5,0.5] but what about std::uniform_real_distribution and std::bernoulli_distribution, etc. I count 26 uses of generate_canonical in libstdc++'s <random>, should they all be changed to use generate_canonical_centred? Wouldn't that mean an extra branch and extra logic in every one of them, to handle the negative half of the range? Or maybe there's no benefit to some distributions and they should continue to use std::generate_canonical? I think the proposal needs to explain the extent of the changes that would result from introducing this new function.
>> >
>> >
>> >
>> > In other words, if the proposal is to allow std::generate_canonical_centred to be customized by callers (but maybe default to 0.5 so it's actually centred around zero?), how would that work for all the existing callers inside the std::lib? If the idea is to allow users to select an algorithm that's backwards compatible with the old spec (so that re-running simulations produces the same results as with the old spec), how would users select that when most uses of std::generate_canonical are hidden inside std::xxx_distribution and are not exposed to users directly?
>> >
>> > tl;dr who are the expected users of this new function, and how exactly would it benefit them?
>> >
>> >
>> >
>> >>
>> >>
>> >> Or maybe I'm just misunderstanding something ... in which case that suggests something in the proposal needs to be clarified.
>> >>
>> >>

Received on 2025-12-05 13:51:50