Re: [std-proposals] std::big_int

From: Marcin Jaczewski <marcinjaczewski86_at_[hidden]>
Date: Thu, 2 Apr 2026 13:20:54 +0200
On Thu, 2 Apr 2026 at 12:43, Jan Schultke <janschultke_at_[hidden]> wrote:
>>
>> > I'm not sure what you mean by "flaw" specifically here. Reference counting as a technique for std::big_int makes sense even in C++11. Move semantics allow you to also "steal" the handle to a reference counter from another std::big_int, so they interop nicely with reference counting.
>> >
>>
>> But it will work the same with a normal, unshared heap allocation.
>> Only copying differs: one version always calls the thread-local
>> allocator, while the other trashes the shared cache line to bump the
>> reference count. And with sharing you still need to allocate when you
>> try to update a local value (and trash the shared cache again).
>>
>> If I had to guess, I would prefer the local version, but this
>> would need some benchmarking.
>
>
> To be clear, the reason why I am interested in a reference-counting implementation is that it solves a lot of programming problems that you would otherwise have, such as making a cheap getter that returns by value.
>
> Making temporary copies is simply a common thing in numeric code as well, and code is easier to write if you don't need to sweat about the cost of negation, absolute values, temporary copies, etc. Implementing a min and max function that returns by value also becomes cheap. The load it takes off your shoulders is enormous.
>
> Another reason is the open issues with allocation that N4038 ran into. In particular, consider the "Rvalue overloads for arithmetic operations" issue:
>
> integer operator+(integer&& lhs, const integer& rhs) {
>     return std::move(lhs += rhs);
> }
>
> Overloads that take rvalue references (possibly on either side, resulting in four overloads total) are actually quite reasonable because they make it possible to reuse allocations, without any effort by the user. The problem is that the interface is littered with many more overloads, and this cost is not just limited to the standard library but to any library that provides some extra numeric functionality for std::big_int. By comparison, a reference-counted approach allows the following:
>
> big_int operator+(big_int lhs, big_int rhs) {
>     if (lhs.is_big() && rhs.is_big()) {
>         // use the allocation with more capacity or something
>     }
>     else if (lhs.is_big()) {
>         if (lhs.is_unique()) {
>             __inplace_plus(lhs, rhs);
>             return lhs;
>         } else {
>             return __copying_plus(lhs, rhs);
>         }
>     }
>     else if (rhs.is_big()) {
>         // ...
>     }
>     else {
>         return big_int(_BitInt(128)(lhs.small_value) +
>                        _BitInt(128)(rhs.small_value));
>     }
> }
>
> This is just to illustrate the principle; the is_unique() check does not work with multi-threading; you need to temporarily set the reference counter to zero if it is 1, via atomic compare exchange, to obtain temporary "unique ownership" over a big_int.
>
> While the implementation of an optimal operator+ is pretty complicated with this approach, the point is that the interface is simple, and can be improved by implementations gradually without breaking changes. By comparison, if std::big_int has unique ownership over its data, you need to design the interface around that by potentially adding these rvalue overloads, and if you don't, you incur costs everywhere that won't go away unless you add more functions to the API.
>
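
For reference, the four-overload scheme mentioned above might be sketched roughly as follows. This is only an illustration of the idea under discussion; `integer` and its members are placeholders, not any proposed API:

```cpp
#include <utility>

// Hypothetical value type standing in for std::big_int.
struct integer {
    long v = 0; // a real big_int would manage heap storage here
    integer& operator+=(const integer& rhs) { v += rhs.v; return *this; }
};

// The four overloads that let operator+ reuse an rvalue operand's
// allocation instead of always allocating a fresh result:
integer operator+(const integer& lhs, const integer& rhs) {
    integer tmp = lhs;            // must allocate a fresh result
    return std::move(tmp += rhs);
}
integer operator+(integer&& lhs, const integer& rhs) {
    return std::move(lhs += rhs); // reuse lhs's storage
}
integer operator+(const integer& lhs, integer&& rhs) {
    return std::move(rhs += lhs); // addition commutes, so reuse rhs
}
integer operator+(integer&& lhs, integer&& rhs) {
    return std::move(lhs += rhs); // either side reusable; pick lhs
}
```

This is the "interface littered with overloads" cost the quoted text describes: every binary operation on the type, in every library, would need a similar set.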

True, it has its benefits, but atomic increments have a cost too and
could affect every core.
Looking at people who want 101% of the CPU and go out of their way to
avoid sharing cache lines between cores, it seems this `big_int` would
not be acceptable to them.
Or maybe have two types, `big_int` and `big_int_shared`, that share an
implementation but differ in copy behavior?
You could cheaply move memory between them, as both have counters, but
the "unique" one never touches its counter and always keeps it at `0`.
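
For illustration, the "temporarily set the reference counter to zero via atomic compare-exchange" trick mentioned earlier in the thread could be sketched like this. All names here are hypothetical, and this is only a sketch of the ownership check, not of a full control block:

```cpp
#include <atomic>

// Hypothetical control block shared by copies of a reference-counted
// big_int.
struct control_block {
    std::atomic<unsigned> refs{1};

    // Try to claim temporary unique ownership: if the count is exactly 1,
    // drop it to 0 so that a concurrent thread copying this big_int
    // cannot race with our in-place mutation. Returns true on success.
    bool try_acquire_unique() {
        unsigned expected = 1;
        return refs.compare_exchange_strong(expected, 0,
                                            std::memory_order_acquire);
    }

    // Restore the count once the in-place operation is done.
    void release_unique() {
        refs.store(1, std::memory_order_release);
    }
};
```

A plain `refs.load() == 1` check is not enough, because another thread could bump the count between the check and the mutation; the compare-exchange closes that window.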

Received on 2026-04-02 11:21:09