C++ Logo

sg12

Advanced search

Re: [isocpp-ext] Total order for pointer comparisons (again)

From: Alisdair Meredith <alisdairm_at_[hidden]>
Date: Sat, 25 Nov 2023 04:35:51 +0700
I have been looking into writing a paper that suggests making a “flat address space” conditionally supported. In a flat address space, there is a total order for addresses, and pointers would compare only their addresses, without regard to provenance in the abstract machine. This would satisfy naive user expectations, removing another “embarrassment” from the language, while allowing for a compiler switch to re-enable the pointer provenance model and the (often surprising) optimization opportunities it grants.

The main problem is I am already wrestling with too many half-written papers, so the odds of my completing this particular paper in time for Tokyo seem low, although I would really like to address this topic for C++26.

AlisdairM
Sent from my iPhone

> On Nov 25, 2023, at 01:23, Giuseppe D'Angelo via Ext <ext_at_[hidden]> wrote:
>
> Hello,
>
> I'd like to resume this old discussion:
>
> https://lists.isocpp.org/sg12/2013/08/0095.php
>
> I'm wondering if there has been a conclusion to it, and especially if there's some appetite for (finally) having language comparison operators always establish a total order between pointers.
>
> --
>
> For some more context: while reviewing some code earlier, I realized something which was already discussed by that old thread: the standard comparison functions objects (less, greater, compare_three_way etc.) guarantee a (total) order between pointers, but this guarantee ceases to exist if one compares a "box" containing a pointer. The result is instead unspecified, and in certain contexts becomes a source of UB.
>
> "Box" in this context is anything that wraps a pointer and uses it as its own "salient state" for comparisons. The Standard Library is full of them: containers (e.g. a vector of pointers); vocabulary types (e.g. pair, tuple, optional, expected, variant; soon reference_wrapper with P2944; the recently discussed indirect); likely more. User code is also of course full of them, and since C++20 we have a language feature (defaulted operator<=>) that follows the same direction: apply the underlying language operator to the base/member sub-objects.
>
> The reason why a comparison object fails to guarantee a total order when comparing boxes is because the comparison will simply call the respective _language_ comparison operator on the box; and, as per the current specification, the box will then always use the language comparison operator on its salient state.
>
>
> In not so many words, TL;DR:
>
> * using `std::less<int*>` to compare pointers guarantees a total order;
>
> * using `std::less<std::tuple<int *>>` to compare two tuples doesn't, because it will just call `operator<` between `tuple<int*>`, and that calls `operator<` between pointers, and that result is unspecified.
>
> This can easily become UB: `std::set<std::tuple<int *>>` may violate the Compare requirements.
>
> Of course the same applies for any other comparison object type and any other box type.
>
> --
>
> Judging from that old thread, I am not the only one that doesn't really like this status quo. I'd like to reason about some possible ways out of this:
>
>
> 0) Do nothing, nowadays we have ASAN, which is actually able to flag these mistakes (-fsanitize=pointer-compare). I don't think that doing nothing "because tooling will catch it" is an acceptable solution (in general, and even more in this case, where this particular tooling is runtime, invasive, and expensive to run).
>
>
>
> 1) Have comparison operators between pointers always establish a total order. This is obviously implementable everywhere in C++, as std::less exists. This would also be my preferred solution, killing a long-standing source of unspecified/undefined behaviour, simplifying the language, removing surprises for non-experts, and making the above just work™.
>
> Downsides: I have no idea if this would incur in unacceptable costs on some platforms (which ones? why?), nor if C would like to follow suit (by the way, why is the result of pointer comparison "merely" unspecified behaviour in C++, but undefined behaviour in C?).
>
> So, 10 years after the original thread: could we discuss this possibility again? What's the gain/loss ratio for mandating total ordering in pointers starting from 2026?
>
>
>
> 2) Change the behaviour of the "boxes" in the Standard Library by having their comparison operators NOT use the corresponding language operator on their salient state, but a function object that guarantees pointer ordering. For instance, assuming `operator<` between tuples were still there, change it to use `std::less` on each corresponding member of the two tuples being compared, instead of `operator<`.
>
> This is perhaps doable, but sounds tremendously invasive, doesn't work with a defaulted `operator<=>`, constitutes a massive burden to teach.
>
>
>
> 3) Introduce yet another "box", e.g. `compare_wrapper<T>`, that yields a total order when T is a pointer, otherwise just forwards the comparisons.
>
> This sounds extremely tedious to use in practice; a library band-aid for a language problem; and a semantic departure from all the other standard library boxes (which would still have a "trap").
>
>
>
> 4) Concoct yet another comparison framework in the language/library, where the Box being compared defines the comparison strategy for its salient state (e.g. compare sub-objects in lexicographic order, like tuple; or check for emptiness and then compare, like optional; etc.), and the user provides a comparison function.
>
> This comparison function is then applied according to the Box's strategy (and using some form of structural recursion).
>
> Something like `COMPARE(std::less, tupleA, tupleB)` (invented syntax) would have the tuples apply std::less on their subobjects (recursively); `COMPARE(case_insensitive, optStringA, optStringB)` would do case insensitive string comparison (on top of optional's own semantics), and so on.
>
> (This is akin to some proposals for "better hashing" that have been floated some time ago.)
>
> While fascinating and useful, sounds like a massive effort, especially in a language without reflection. It also sounds really confusing to have yet another way to compare objects now that we've "just" standardized <=>.
>
> ---
>
> As I've mentioned, I'd strongly prefer to kill the entire problem at the source by standardizing a total order between pointers, and stop building fancy workarounds on top of it. But I'd like of course to hear more opinions.
>
>
> Thank you for your time,
> --
> Giuseppe D'Angelo
> _______________________________________________
> Ext mailing list
> Ext_at_[hidden]
> Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/ext
> Link to this post: http://lists.isocpp.org/ext/2023/11/22198.php

Received on 2023-11-24 21:36:08