<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 28 Aug 2024 at 14:13, Oliver Schädlich via Std-Proposals &lt;<a href="mailto:std-proposals@lists.isocpp.org">std-proposals@lists.isocpp.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>

  

    
  
  <div>
    <p>I recently presented my idea for an improvement of
      shared_ptr&lt;&gt; in combination with<br>
      atomic&lt;shared_ptr&lt;&gt;&gt; here, but I was misunderstood. My
      idea was simply that shared_ptr&lt;&gt;<br>
      should add a new assignment-operator with an
      atomic&lt;shared_ptr&lt;&gt;&gt; as its parameter. The<br>
      trick with that would be that the assignment-operator could check
      if both pointers point to<br>
      the same data and then it could do nothing.<br>
      If you have an RCU-like pattern, the participating threads could
      simply keep a thread_local<br>
      shared_ptr&lt;&gt; which is periodically updated with such an
      assignment. Like with RCU-like<br>
      patterns the central atomic&lt;shared_ptr&lt;&gt;&gt; would be
      updated rarely and the comparison<br>
      would be very quick on mostly shared cachelines.<br>
      With current C++, load()ing from an atomic&lt;shared_ptr&lt;&gt;&gt;
      is extremely slow, even when you<br>
      have a lock-free solution since the cacheline-flipping between the
      cores is rather expensive.<br>
      I&#39;ve written my own shared_obj&lt;&gt;- (like shared_ptr&lt;&gt;)
      and tshared_obj&lt;&gt;-classes (like atomic<br>
      &lt;shared_ptr&lt;&gt;&gt;) and with this optimization I get a
      speedup of factor 5,000 when 32 threads<br>
      are contending on a single tshared_obj&lt;&gt;, i.e. constantly
      copying its pointer to a shared_obj&lt;&gt;<br>
      object while the tshared_obj is rarely updated.<br>
      <br>
      This is the code:</p>
    <p><font face="monospace">template&lt;typename T&gt;<br>
        shared_obj&lt;T&gt; &amp;shared_obj&lt;T&gt;::operator =(
        tshared_obj&lt;T&gt; const &amp;ts ) noexcept<br>
        {<br>
            using namespace std;<br>
            // important optimization for RCU-like patterns:<br>
            // are we managing the same object as the thread-safe
        object-pointer?<br>
            if( (size_t)m_ctrl == ts.m_dca.first().load(
        memory_order_relaxed ) ) [[likely]]<br></font></p></div></blockquote><div><br></div><div>The [[likely]] branch hint is specific to your use case; a similar function in the standard couldn&#39;t make that assumption.</div><div><br></div><div>What is the (size_t) cast for? Should that be uintptr_t?<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p><font face="monospace">
                // yes: nothing to do<br>
                return *this;<br>
            // erase our object pointer<br>
            this-&gt;~shared_obj();<br></font></p></div></blockquote><div><br></div><div>This looks like undefined behaviour, since you destroy *this and don&#39;t create a new object in its place.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p><font face="monospace">
            // copy() increments the reference count for ourself<br>
            m_ctrl = ts.copy();<br>
            return *this;<br>
        }</font></p></div></blockquote><div><br></div><div>What&#39;s the actual optimization happening here? Is it avoiding the reference count increment/decrement pair in the &quot;do nothing&quot; case? You say above that load() is slow: is that specifically atomic&lt;shared_ptr&lt;T&gt;&gt;::load(), which increments the ref count, or any atomic load from another cacheline? Your code above still does an atomic load.<br></div><div><br></div><div>Is your m_ctrl just a pointer? std::shared_ptr needs to update two pointers, not one, so I&#39;m not sure how this actually relates to std::shared_ptr. What is the equality above equivalent to: shared_ptr&#39;s operator==, std::owner_equal, or both?<br></div><br></div></div>
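<div>To make the undefined-behaviour point concrete, here is a minimal single-threaded sketch of how the same assignment could be written without ending the lifetime of *this. The names ctrl_block, toy_slot and toy_handle are hypothetical stand-ins, not the poster&#39;s shared_obj/tshared_obj classes, the counts are plain integers rather than atomics, and nothing is ever freed; the only point is the order of operations: keep the fast path, take the new reference first, then release the old one.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical control block: a payload plus a plain (non-atomic) count.
// A real implementation would use atomics and free the block at zero.
struct ctrl_block {
    int value;
    std::size_t refs;
};

// Hypothetical stand-in for the thread-safe slot (single-threaded here).
struct toy_slot {
    ctrl_block* ctrl = nullptr;

    // Returns the current block with its count already incremented.
    ctrl_block* copy() const noexcept {
        if (ctrl) ++ctrl->refs;
        return ctrl;
    }
};

class toy_handle {
public:
    toy_handle() noexcept = default;
    toy_handle(const toy_handle&) = delete;
    toy_handle& operator=(const toy_handle&) = delete;
    ~toy_handle() { release(m_ctrl); }

    toy_handle& operator=(const toy_slot& ts) noexcept {
        // Fast path: already referring to the same control block.
        if (m_ctrl == ts.ctrl) return *this;
        // Acquire the new reference first, then drop the old one.
        // *this stays alive throughout; no explicit destructor call.
        ctrl_block* old = std::exchange(m_ctrl, ts.copy());
        release(old);
        return *this;
    }

    const ctrl_block* get() const noexcept { return m_ctrl; }

private:
    static void release(ctrl_block* c) noexcept {
        if (c) {
            assert(c->refs > 0);
            --c->refs;  // this sketch only counts; it never deletes
        }
    }

    ctrl_block* m_ctrl = nullptr;
};
```

<div>Assigning from the same slot twice leaves the count unchanged on the second assignment. Once the slot is repointed, the next assignment moves the handle&#39;s reference from the old block to the new one.</div>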

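<div>The operator== versus owner-equality question can be illustrated with std::shared_ptr&#39;s aliasing constructor. In this small sketch, same_owner and pair_of_ints are hypothetical helper names; owner equivalence is expressed through owner_before, which std::shared_ptr already provides, rather than through std::owner_equal. It shows that the two notions can disagree in both directions, so comparing a single control pointer corresponds to owner equality, not to operator==.</div>

```cpp
#include <cassert>
#include <memory>

struct pair_of_ints {
    int a = 0;
    int b = 0;
};

// Owner equivalence: neither pointer orders before the other,
// i.e. both share one control block (or both are empty).
template <typename T, typename U>
bool same_owner(const std::shared_ptr<T>& x,
                const std::shared_ptr<U>& y) noexcept {
    return !x.owner_before(y) && !y.owner_before(x);
}

void demo() {
    auto whole = std::make_shared<pair_of_ints>();

    // Aliasing constructor: share ownership of `whole`,
    // but store a pointer to one of its members.
    std::shared_ptr<int> a(whole, &whole->a);
    std::shared_ptr<int> b(whole, &whole->b);

    // Same control block, different stored pointers.
    assert(a != b);
    assert(same_owner(a, b));

    // Different control block, same stored pointer.
    auto other = std::make_shared<int>(0);
    std::shared_ptr<int> c(other, &whole->a);
    assert(a == c);
    assert(!same_owner(a, c));
}
```

<div>Calling demo() exercises both cases. A fast path that should leave a std::shared_ptr untouched would have to establish that both the stored pointer and the owner match.</div>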
