Date: Tue, 27 Aug 2013 08:48:50 -0700
Thanks for chiming in!
On Tue, Aug 27, 2013 at 1:00 AM, Xavier Leroy <Xavier.Leroy_at_[hidden]> wrote:
>> [Semantics of pointer comparisons]
>> Xavier Leroy, the author of CompCert, and I have had long discussions
>> about this about a year ago. He is on the list, so he will add
>> clarifications or corrections when he gets a chance. Last time (last
>> Spring) we talked again about this, he said he has made some adjuments,
>> but I don't recall the details.
>
> Yes, I'll chime in. Just a bit of context for the members of this
> list: as part of the formal verification of the CompCert compiler, I
> and others have been giving formal semantics to a large subset of ISO
> C, and (with Tahina Ramananandro and Gabriel Dos Reis) to some aspects
> of C++ too.
>
> >From the standpoint of formal semantics, it is easier to account for
> undefined behaviors (when everything can happen, incl. crashing) than
> for defined-but-unspecified behaviors (when several but not all
> outcomes are possible). And of course accounting for
> defined-and-uniquely-specified behaviors is easier too.
>
> The example that Gabriel was alluding to is equality tests between a
> pointer "one past" the end of an object and a pointer to the beginning
> of another object. In C and C++, if I'm not mistaken, this is defined
> behavior but can return true or false at the whim of the
> implementation. This nondeterminism was difficult to account for in
> CompCert's semantics, so in the end we treat it as undefined behavior,
> while still defining comparisons involving pointers "one past" as long
> as the pointers are within the same object.
>
> Coming back to the proposal of making "<" a total (but unspecified
> order on pointers):
>
> - As a semanticist, I think it would be challenging to formalize, for
> the general reasons given above.
How do you model std::less<T*>, which already requires this total but
unspecified order?
> - As a programmer, I see how this could be very useful to implement
> various generic data structures, or to choose a "canonical"
> representative between several semantically-equivalent data structures.
>
> - As a compiler writer, I can see a few useful optimizations that this
> change would invalidate. Here is an example:
>
> int x[2];
> int * p;
> ...
> if (p <= x + 1) {...} else {...}
>
> With the current semantics, "p <= x + 1" is only defined if p points
> within x, so it is equivalent to "p != x + 2", which gives useful
> aliasing information in the two branches of the conditional.
> This is a contrived example, though.
>
> - As an amateur of garbage-collected languages, I see a problem with
> relocating garbage collection. (C++ doesn't require a GC, but it is
> my understanding that the C++ committee is trying to make C++
> compatible with the use of a GC.) A relocating GC can move objects
> around, changing their absolute addresses. So, the unspecified
> total order on pointers could change during the execution of the
> program: before a GC, we have "p < q"; then q is moved, and then
> "p > q". This would make the total order unusable to
> e.g. order binary search trees...
On Tue, Aug 27, 2013 at 1:00 AM, Xavier Leroy <Xavier.Leroy_at_[hidden]> wrote:
>> [Semantics of pointer comparisons]
>> Xavier Leroy, the author of CompCert, and I have had long discussions
>> about this about a year ago. He is on the list, so he will add
>> clarifications or corrections when he gets a chance. Last time (last
>> Spring) we talked again about this, he said he has made some adjuments,
>> but I don't recall the details.
>
> Yes, I'll chime in. Just a bit of context for the members of this
> list: as part of the formal verification of the CompCert compiler, I
> and others have been giving formal semantics to a large subset of ISO
> C, and (with Tahina Ramananandro and Gabriel Dos Reis) to some aspects
> of C++ too.
>
> >From the standpoint of formal semantics, it is easier to account for
> undefined behaviors (when everything can happen, incl. crashing) than
> for defined-but-unspecified behaviors (when several but not all
> outcomes are possible). And of course accounting for
> defined-and-uniquely-specified behaviors is easier too.
>
> The example that Gabriel was alluding to is equality tests between a
> pointer "one past" the end of an object and a pointer to the beginning
> of another object. In C and C++, if I'm not mistaken, this is defined
> behavior but can return true or false at the whim of the
> implementation. This nondeterminism was difficult to account for in
> CompCert's semantics, so in the end we treat it as undefined behavior,
> while still defining comparisons involving pointers "one past" as long
> as the pointers are within the same object.
>
> Coming back to the proposal of making "<" a total (but unspecified
> order on pointers):
>
> - As a semanticist, I think it would be challenging to formalize, for
> the general reasons given above.
How do you model std::less<T*>, which already requires this total but
unspecified order?
> - As a programmer, I see how this could be very useful to implement
> various generic data structures, or to choose a "canonical"
> representative between several semantically-equivalent data structures.
>
> - As a compiler writer, I can see a few useful optimizations that this
> change would invalidate. Here is an example:
>
> int x[2];
> int * p;
> ...
> if (p <= x + 1) {...} else {...}
>
> With the current semantics, "p <= x + 1" is only defined if p points
> within x, so it is equivalent to "p != x + 2", which gives useful
> aliasing information in the two branches of the conditional.
> This is a contrived example, though.
>
> - As an amateur of garbage-collected languages, I see a problem with
> relocating garbage collection. (C++ doesn't require a GC, but it is
> my understanding that the C++ committee is trying to make C++
> compatible with the use of a GC.) A relocating GC can move objects
> around, changing their absolute addresses. So, the unspecified
> total order on pointers could change during the execution of the
> program: before a GC, we have "p < q"; then q is moved, and then
> "p > q". This would make the total order unusable to
> e.g. order binary search trees...
Received on 2013-08-27 17:49:15