Date: Fri, 5 Dec 2025 13:02:25 +0000
Hi,
The main motivation here is to be able to resolve LWG 2524
without creating additional issues.
In particular, it avoids the issue mentioned in p0952r2:
"code that depends on a specific sequence of results from repeated
invocations, or on a particular number of calls to the URBG argument,
will be broken."
Why does this matter? I work daily with Monte Carlo simulations, and
if I want to parallelise them, I have to be able to discard RNG output
deterministically. A lot of work has gone into making those discards
fast, but under that proposal I would have to actually produce the
numbers generated by the canonical distribution and check them one by
one to make sure there has not been any additional discard.
What I suggest is open to discussion:
-maybe this centered_canonical_distribution should be used exclusively
inside other distributions like std::exponential_distribution, to
avoid LWG 2524 while not breaking the specific sequence of results
-maybe we want to expose this "centered_canonical_distribution" as a
generalisation of canonical_distribution (canonical_distribution would
then be the specialisation of centered_canonical_distribution that
sets the limit parameter to 1)
-the reason I introduced the limit parameter today is to show how this
approach can prioritise either backward compatibility or a more
elegant symmetric distribution. That's up for discussion; I felt that
giving this additional flexibility would make the proposal more
realistic.
-the symmetry around 0 does have some nice properties, but I think
that is secondary to fixing LWG 2524 and avoiding the issues created
by p0952r2
- bear in mind that the relatively recent trend of quantisation on
TPUs makes this more relevant: with lower-precision floats, hitting a
value of 1 from canonical_distribution becomes more and more likely.
Being able to benefit from subnormal numbers on both sides of 0 would
make this issue go away, and would allow using 32-, 64- or even
128-bit unsigned integer generators to safely generate float16 values,
for example.
I am new to this process, so if there is any meeting or forum to
discuss this (besides these emails) I would be more than happy to take
part.
Lucas
On Fri, 5 Dec 2025 at 12:32, Jonathan Wakely <cxx_at_[hidden]> wrote:
>
>
>
> On Fri, 5 Dec 2025 at 12:17, Jonathan Wakely <cxx_at_[hidden]> wrote:
>>
>> On Fri, 5 Dec 2025 at 11:03, Juan Lucas Rey wrote:
>>>
>>> Hello,
>>>
>>> Here is an example code that shows:
>>>
>>> -the original issue reproduced as shown in
>>> https://cplusplus.github.io/LWG/issue2524
>>
>>
>> For anybody else trying it, the original problem can be reproduced with libc++ but not libstdc++, because libstdc++ has a workaround in std::generate_canonical which returns the largest value of RealType which is less than RealType(1).
>>
>>>
>>>
>>> - std2::exponential_distribution, using internally
>>> std2::generate_canonical_centered and NOT showing the issue
>>> -other values are the same
>>>
>>> I have added a double template parameter "limit" to
>>> "std2::generate_canonical_centered". to allow maximum backwards
>>> compatibility, that value should be close to 1.
>>> setting that value to 0.5 is maybe more elegant, but less backward compatible.
>>
>>
>> So it's just a hardcoded "use a different calculation if the result would be greater than LIMIT" branch inside generate_canonical_centred. OK. But isn't that alternative path taken for a tiny fraction of the values from the URNG? Something like 1e-8 of the full range? So it's not symmetrical around zero? Instead it seems to return values in the range [-1e-08, 1.0).
>>
>> I thought the proposal was for a symmetric distribution around zero? So I think I'm failing to understand the proposal.
>
>
>
> Also, is that 'limit' parameter actually part of the proposal? Because it wasn't shown in the original PDF you sent, but it seems to be necessary to offer the backwards compatibility which is mentioned in the PDF and in your original email.
>
>
>>
>>
>> I'm not sure how very occasionally returning values below zero, i.e. below the expected range of [0,1), is better than very occasionally returning 1.0, i.e. above the expected range.
>>
>> So again, I think the idea of an alternative function that is symmetrical around zero is interesting. But I don't understand the backwards compatibility argument at all. It seems that your solution returns the same values as the old spec for std::generate_canonical in most cases, but just has a different form of failure in the problematic cases. Sure, we can adjust std::exponential_distribution to cope with the "bad" results, but std::generate_canonical_centred with a limit of 0.999999 is not centred, and fails to meet the original contract for std::generate_canonical.
>>
>> Unless I'm misunderstanding something, I think this proposal would make more sense if you dropped the backwards compatibility claims, and just demonstrated why a symmetric-around-zero distribution has valuable properties. But you should also show how it would be used by the rest of the library. You've shown how std::exponential_distribution can be adapted to work with (-0.5,0.5] but what about std::uniform_real_distribution and std::bernoulli_distribution, etc. I count 26 uses of generate_canonical in libstdc++'s <random>, should they all be changed to use generate_canonical_centred? Wouldn't that mean an extra branch and extra logic in every one of them, to handle the negative half of the range? Or maybe there's no benefit to some distributions and they should continue to use std::generate_canonical? I think the proposal needs to explain the extent of the changes that would result from introducing this new function.
>
>
>
> In other words, if the proposal is to allow std::generate_canonical_centred to be customized by callers (but maybe default to 0.5 so it's actually centred around zero?), how would that work for all the existing callers inside the std::lib? If the idea is to allow users to select an algorithm that's backwards compatible with the old spec (so that re-running simulations produces the same results as with the old spec), how would users select that when most uses of std::generate_canonical are hidden inside std::xxx_distribution and are not exposed to users directly?
>
> tl;dr who are the expected users of this new function, and how exactly would it benefit them?
>
>
>
>>
>>
>> Or maybe I'm just misunderstanding something ... in which case that suggests something in the proposal needs to be clarified.
>>
>>
Received on 2025-12-05 13:02:43
