ISOCPP Std Discussion List: Re: Guarantees over addresses from function pointers created from lambda

From: Marcin Jaczewski <marcinjaczewski86_at_[hidden]>
Date: Mon, 28 Apr 2025 16:19:55 +0200

On Mon, 28 Apr 2025 at 14:51, Jennifier Burnett via Std-Discussion
<std-discussion_at_[hidden]> wrote:
>
> >If I use in my code get3 instead of get1 the lifetime of T does not begin.
>
> Yes, if you did a ctrl-f replace of "get1" with "get3" that would be incorrect. That's not what we're talking about though.
>
> What we are talking about is if the compiler can, after having generated the ASSEMBLY for get1 and get3, eliminate the definition of one of them and redirect the label to the other definition.
>
> At this point we are well beyond the boundaries of C++ and into the machine, which doesn't have any concepts of "lifetime" or "type", everything is just bytes.

But consider functions:

```
void getPtr(void* x, void* y) { *(int**)x = *(int**)y; }
void getVal(void* x, void* y) { *(int64*)x = *(int64*)y; }

void callback(Config* c, OtherData* d)
{
    if (c->getter == &getPtr) {
       c->getter(&d->ptr, &c->data);
    }
    if (c->getter == &getVal) {
       c->getter(&d->value, &c->data);
    }
}

int i = 13;
int* p = nullptr;
Config cv{&getVal, &i};
Config cp{&getPtr, &p};

setCallback(&callback);
setData(&cv);
setData(&cp);
```

You do not control `setData`, but you can set a global handler that
recovers the type by comparing the getter against `getPtr` or `getVal`.
I had one project where I was forced to do this because the only API
I had was a callback and a void pointer.
Without a guarantee of unique addresses I can't write code like this,
and the alternative of updating `callback` for each kind of data might
not work if the application is multithreaded.
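
To make the pattern above concrete, here is a minimal self-contained sketch. The layouts of `Config` and `OtherData` and the `run_both` driver are assumptions for illustration only (the real types come from the third-party API), and the sketch passes `c->data` itself rather than `&c->data`, which is my reading of the intent:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layouts; the real API only hands back a callback and a void pointer.
struct Config {
    void (*getter)(void* dst, void* src);  // doubles as a type tag for `data`
    void* data;
};

struct OtherData {
    int* ptr = nullptr;
    std::int64_t value = 0;
};

void getPtr(void* x, void* y) { *static_cast<int**>(x) = *static_cast<int**>(y); }
void getVal(void* x, void* y) { *static_cast<std::int64_t*>(x) = *static_cast<std::int64_t*>(y); }

// Recovers the type behind c->data by comparing the getter's address.
// This is only correct if getPtr and getVal compare unequal.
void callback(Config* c, OtherData* d)
{
    assert(c != nullptr && d != nullptr);
    if (c->getter == &getPtr) {
        c->getter(&d->ptr, c->data);
    } else if (c->getter == &getVal) {
        c->getter(&d->value, c->data);
    }
}

// Hypothetical driver standing in for whatever sits behind setData().
OtherData run_both(std::int64_t& value, int*& pointer)
{
    Config cv{&getVal, &value};
    Config cp{&getPtr, &pointer};
    OtherData out;
    callback(&cv, &out);
    callback(&cp, &out);
    return out;
}
```

On a 64-bit target both getters are plain 8-byte copies, so they are exactly the kind of functions a folding pass would like to merge. If they ended up with the same address, the first branch would also fire for `cv` and write into `d->ptr`.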

>
> If you were able to tell the difference between the compiler running one function and another, then merging the two definitions would be immediately incorrect, no further discussion needed. The reason the compiler might merge the two functions is specifically that there is no difference between them when fed into the CPU.
>
> For example, written in aarch64 assembly the three functions would look like:
>
> ```
> get1:
> ret
> get2:
> ret
> get3:
> ret
> ```
>
> What you are suggesting is that jumping to each of those 3 labels will produce different behaviour, which you can tell visually is incorrect.

Each address value is unique and can be checked by the CPU, so this is observable behavior.
But the compiler can still give every function its own entry address and share the body:
```
get1:
   nop
get2:
   nop
get3:
   ;lot of
   ;shared code
   ret
```
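
The same idea can be approximated by hand in C++. This is only a sketch with hypothetical names (`shared_body`, `entry1`..`entry3`, `which`, `check_entries`): three functions with distinct addresses all forward to one shared body, so comparing the pointers still tells them apart.

```cpp
#include <cassert>

// Hypothetical shared implementation that all entry points forward to.
int shared_body(int x) { return x * 2 + 1; }

// Three distinct functions, hence three distinct addresses, one body.
int entry1(int x) { return shared_body(x); }
int entry2(int x) { return shared_body(x); }
int entry3(int x) { return shared_body(x); }

using entry_fn = int (*)(int);

// Maps an address back to the entry it names: 1, 2, 3, or 0 if unknown.
int which(entry_fn f)
{
    if (f == &entry1) return 1;
    if (f == &entry2) return 2;
    if (f == &entry3) return 3;
    return 0;
}

// Same behaviour through every entry, but each entry keeps its own identity.
void check_entries()
{
    assert(entry1(4) == entry2(4) && entry2(4) == entry3(4));
    assert(which(&entry1) == 1 && which(&entry2) == 2 && which(&entry3) == 3);
}
```

The cost is one extra jump or a few bytes of padding per entry, not a full copy of the body.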

>
> >From the documentation I've read, what merges functions/function pointers together might also merge constants
>
> No. This is why [[no_unique_address]] exists. All objects (except empty base classes) will have addresses which are unique amongst all other objects that are not marked [[no_unique_address]].
>
> >Yes, but I do not see any difference if the hand-made vtable is in the type-erased class like I did ("inline") or somewhere else ("outline") as you proposed.
>
> There is no difference in usage. What I am proposing is that you use the address of the vtable itself to uniquely identify each type rather than the address of one of the functions in the vtable like in your original example, since as stated above each vtable will have a unique address that couldn't be confused with any of the others.
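
A minimal sketch of that suggestion, assuming a hand-rolled vtable; all names here (`vtable`, `vtable_for`, `erased`, `erase`, `is_same_type`) are hypothetical and not taken from the original example:

```cpp
#include <cassert>
#include <new>

// Hypothetical hand-rolled vtable: one object per erased type.
struct vtable {
    void (*copy)(void* dst, const void* src);
    void (*destroy)(void* obj);
};

template <typename T>
void copy_impl(void* dst, const void* src) { ::new (dst) T(*static_cast<const T*>(src)); }

template <typename T>
void destroy_impl(void* obj) { static_cast<T*>(obj)->~T(); }

// One vtable object per T; its address is what identifies the type.
template <typename T>
struct vtable_for { static const vtable value; };

template <typename T>
const vtable vtable_for<T>::value = {&copy_impl<T>, &destroy_impl<T>};

// Type-erased handle: identity is the vtable's address, not a function's address.
struct erased {
    const vtable* vt;
    void* obj;
};

template <typename T>
erased erase(T& value) { return {&vtable_for<T>::value, &value}; }

bool is_same_type(const erased& lhs, const erased& rhs)
{
    assert(lhs.vt != nullptr && rhs.vt != nullptr);
    return lhs.vt == rhs.vt;
}
```

With this layout the comparison no longer depends on whether `copy_impl<int>` and `copy_impl<long>` end up at the same address, only on the two vtable objects being distinct.
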
>
> On 28 April 2025 09:46:01 BST, Federico Kircheis <federico_at_[hidden]> wrote:
> >On 28/04/2025 10:16 am, Jennifier Burnett wrote:
> >>> struct T{int i;};
> >>> T* get1(void* ptr){ return std::start_lifetime_as<T>(ptr);}
> >>> T* get2(void* ptr){ return reinterpret_cast<T*>(ptr);}
> >>> T* get3(void* ptr){ return std::launder(reinterpret_cast<T*>(ptr));}
> >>>
> >>>
> >>> get1/get2/get3 might generate the same code, but it is not possible to use one instead of another.
> >>
> >> Why not? In that case they'd all do the same thing at assembly level (nothing). You're confusing abstract machine semantics and actual processor semantics. Any actual machine that tracked object lifetimes and thus had those three examples be meaningfully different would generate different machine code for them and thus make them not able to be merged.
> >
> >If I use in my code get3 instead of get1 the lifetime of T does not begin.
> >Thus if I store those 3 function pointers and call them at runtime at the right places, but they get swapped out (assuming that someone says it is fine), how can the compiler track the lifetime?
> >
> >>> If anyone has a suggestion how the concrete example I have provided can be improved, I'm all ears.
> >>
> >> Provide a cheap unique type identifier:
> >>
> >> ```
> >> template<typename T>
> >> struct unique_object { static char obj; };
> >> ```
> >>
> >> Because objects need to have unique addresses `void* id = &unique_object<T>::obj;` will be different for any given T.
> >
> >Objects need to have unique addresses just like function pointers.
> >
> >From the documentation I've read, what merges functions/function pointers together might also merge constants (yes, it is a mutable object in your example, but never modified, so is it safe to assume it is not treated like a constant?)
> >
> >
> >> Better, save some space and make obj contain the function pointers (i.e. roll your own vtable).
> >
> >Yes, but I do not see any difference if the hand-made vtable is in the type-erased class like I did ("inline") or somewhere else ("outline") as you proposed.
> >
> >> Alternatively if you don't have a problem with RTTI you can just skip having to do it yourself and just make a type with virtual functions that wraps an instance of that object and let the compiler generate the vtable and type comparison for you:
> >>
> >> ```
> >> struct interface
> >> {
> >> virtual ~interface() {}
> >> virtual void copy_assign(void* lhs, void* rhs) = 0;
> >> ...
> >> };
> >> template<typename T>
> >> struct wrapper : public interface
> >> {
> >> T value;
> >>
> >> ~wrapper() override = default;
> >>
> >> void copy_assign(void* lhs, void* rhs) override
> >> {
> >> ...
> >> }
> >> ...
> >> };
> >> bool is_same_type(interface* lhs, interface* rhs)
> >> {
> >> return typeid(*lhs) == typeid(*rhs);
> >> }
> >> }
> >> ```
> >>
> >> Then you just store `wrapper<T>` instead of the object
> >
> >Yes, I was avoiding RTTI.
> >
> >> On 28 April 2025 08:32:18 BST, Federico Kircheis via Std-Discussion <std-discussion_at_[hidden]> wrote:
> >>> On 28/04/2025 7:56 am, Tiago Freire via Std-Discussion wrote:
> >>>> I’m not convinced by this example. I have written many callback systems and never has “uniqueness” been relevant.
> >>>>
> >>>> Is that a good design? Or one that can’t be improved upon?
> >>>>
> >>>> Had C++ not provided this guarantee all along, would the algorithm (in the context of its intended end use) be undesignable?
> >>>>
> >>>> As mentioned, the only reason it would make a difference is because you are trying to do logic related to the function beyond “it exists, and it is callable”, and whatever that is there’s probably a better solution to be found elsewhere.
> >>>>
> >>>> In this case, what you want is a unique token and absent a feature for that you used an unrelated “this function address is unique” to achieve that.
> >>>>
> >>>> I’m not saying that maybe it isn’t necessary in this case, I don’t see the concrete example you might have in mind.
> >>>
> >>>
> >>> If anyone has a suggestion how the concrete example I have provided can be improved, I'm all ears.
> >>>
> >>> But the problem appears in other context too:
> >>>
> >>> struct T{int i;};
> >>> T* get1(void* ptr){ return std::start_lifetime_as<T>(ptr);}
> >>> T* get2(void* ptr){ return reinterpret_cast<T*>(ptr);}
> >>> T* get3(void* ptr){ return std::launder(reinterpret_cast<T*>(ptr));}
> >>>
> >>>
> >>> get1/get2/get3 might generate the same code, but it is not possible to use one instead of another.
> >>>
> >>>> But if the address being unique is important for this function, wouldn’t it be better to explicit request it with a tag [[unique_address]] and not make it the default everywhere?
> >>>
> >>> Attributes can be ignored.
> >>> The correctness of a program should not rely on something that can be ignored.
> >>>
> >>>> *From:*Tom Honermann <tom_at_[hidden]>
> >>>> *Sent:* Monday, April 28, 2025 6:08 AM
> >>>> *To:* std-discussion_at_[hidden]
> >>>> *Cc:* Tiago Freire <tmiguelf_at_[hidden]>
> >>>> *Subject:* Re: [std-discussion] Guarantees over addresses from function pointers created from lambda
> >>>>
> >>>> On 4/27/25 6:53 AM, Tiago Freire via Std-Discussion wrote:
> >>>>
> >>>> > I'm not sure how you could do that other than maybe dropping
> >>>> function pointer comparison altogether except against nullptr.
> >>>>
> >>>> I don't see a reason not to do that.
> >>>>
> >>>> Comparisons can be useful to support callback function registration systems. Designs that use the function pointer itself as the registration key depend on registered functions having exactly one unique address assigned to them.
> >>>>
> >>>> Tom.
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>>
> >>>> *From:* Std-Discussion <std-discussion-bounces_at_[hidden]> on behalf of
> >>>> Jennifier Burnett via Std-Discussion <std-discussion_at_[hidden]>
> >>>> *Sent:* Sunday, April 27, 2025 12:06:36 PM
> >>>> *To:* std-discussion_at_[hidden]; Nate Eldredge via
> >>>> Std-Discussion <std-discussion_at_[hidden]>
> >>>> *Cc:* Jennifier Burnett <jenni_at_[hidden]>
> >>>> *Subject:* Re: [std-discussion] Guarantees over addresses from
> >>>> function pointers created from lambda
> >>>>
> >>>> I have worked on one codebase that did something like this, for basically
> >>>> the same reason as the original example in this thread where they
> >>>> were rolling their own vtables for a type erased callback storage
> >>>> class, and they used a sentinel function to identify trivially
> >>>> copyable and trivially destructible classes to skip the function
> >>>> call. Nullptr was used to indicate an empty callback.
> >>>>
> >>>> In both cases there were branch mispredictions happening on the
> >>>> check for the sentinel (plus additional instructions were needed to
> >>>> form the sentinel address) and it was just faster to call an empty
> >>>> function for the trivial destructor and have the trivial copy just
> >>>> use a function that called memcpy. The original company as far as
> >>>> I'm aware didn't merge our changes back into their codebase.
> >>>>
> >>>> I've also worked on a different codebase which was using a function
> >>>> pointer to a templated function as a form of cheap RTTI on a type
> >>>> erased container (games, so nobody ships codebases with it enabled).
> >>>>
> >>>> Ideally if it was dropped you'd want existing code relying on it to
> >>>> break at compile time, I'm not sure how you could do that other than
> >>>> maybe dropping function pointer comparison altogether except against
> >>>> nullptr.
> >>>>
> >>>> On 27 April 2025 03:49:37 BST, Nate Eldredge via Std-Discussion
> >>>> <std-discussion_at_[hidden]> wrote:
> >>>>
> >>>> On Apr 26, 2025, at 11:34, Andrey Semashev via Std-Discussion
> >>>> <std-discussion_at_[hidden]> wrote:
> >>>>
> >>>>
> >>>> The point is, even if the standard guarantees this [pointers
> >>>> to different functions comparing unequal], it's probably not a
> >>>> good idea to rely on this in practice.
> >>>>
> >>>> This being so, does anyone know if there has ever been a formal
> >>>> proposal to weaken this rule, or informal study of doing so?
> >>>>
> >>>> I can certainly see the aesthetic argument for the rule, but I
> >>>> wonder how much real-life code actually relies on it. The only
> >>>> specific use case I can see is sentinels:
> >>>>
> >>>> void sentinel() { }
> >>>>
> >>>> void foo(void (*callback)()) {
> >>>>     if (!callback) {
> >>>>         proceed_without_callback();
> >>>>     } else if (callback == sentinel) {
> >>>>         something_special();
> >>>>     } else {
> >>>>         callback();
> >>>>     }
> >>>> }
> >>>>
> >>>> but it seems like a modern C++ programmer would rather do that
> >>>> with std::variant or some other way. So I wonder
> >>>> if anyone has studied the impact on existing code bases of
> >>>> dropping this guarantee.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >
> --
> Std-Discussion mailing list
> Std-Discussion_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-discussion

Received on 2025-04-28 14:20:12