Date: Wed, 21 Aug 2019 09:01:00 -0400
Well, the implementation of a standard library function never invokes UB: implementations get to break all sorts of rules. It need not even be written in C++.

As for why the first code snippet is UB: as of C++17, pointers point to objects, not addresses. So when we do that reinterpret_cast, there is no existing rule stating that the result binds to any char object. It still points to the original object, but through a pointer to a different type. That means that when you perform pointer arithmetic, it violates [expr.add] p6 (the type the pointer is declared to point to is not similar to the type of the object it actually points to).
-------- Original message --------
From: Timur Doumler via Std-Proposals <std-proposals_at_[hidden]>
Date: 8/21/19 04:39 (GMT-05:00)
To: std-proposals_at_[hidden]
Cc: Timur Doumler <cpp_at_[hidden]>
Subject: Re: [std-proposals] Allowing access to object representations

And if pointer arithmetic is not allowed, does that mean that all existing implementations of functions like, say, std::memcmp, are UB?

Cheers,
Timur

On 21 Aug 2019, at 10:34, Timur Doumler via Std-Proposals <std-proposals_at_[hidden]> wrote:

Wow, I didn’t know this.

So instead of doing this (= your program, except I fixed x to &x)

    int x = 12345;
    auto p = reinterpret_cast<unsigned char*>(&x);
    for (int i = 0; i < sizeof(x); i++) {
        std::cout << p[i] << '\n';
    }

I need to do this to avoid UB?

    int x = 12345;
    unsigned char buf[sizeof(x)];
    std::memcpy(&buf, &x, sizeof(x));
    for (auto c : buf)
        std::cout << c << '\n';

Of course, both programs compile to the exact same code with -O3.

This is interesting and I was not aware of this. If a char array can provide storage for an object, and a pointer to any element in the array can alias any other type, it is quite surprising that the first program is UB.

Is it just because, as you say, you’re not allowed to do pointer arithmetic on a pointer obtained that way, or is there another reason?

Cheers,
Timur

On 21 Aug 2019, at 10:22, sdkrystian via Std-Proposals <std-proposals_at_[hidden]> wrote:

Using reinterpret_cast, you can access the first element, but that's about it (pointer arithmetic is UB).

-------- Original message --------
From: Timur Doumler via Std-Proposals <std-proposals_at_[hidden]>
Date: 8/21/19 02:23 (GMT-05:00)
To: Brian Bi <bbi5291_at_[hidden]>
Cc: Timur Doumler <cpp_at_[hidden]>, std-proposals_at_[hidden]
Subject: Re: [std-proposals] Allowing access to object representations

That's interesting, thanks for the explanation!

So how would I, in C++17, print the bytes that make up the object representation of int x, without causing UB?

It ought to be possible without memcpying, because char* can alias any other type, including int, but now I am not sure anymore how to correctly write the code that does what your snippet does.

On 21 Aug 2019, at 01:38, Brian Bi <bbi5291_at_[hidden]> wrote:

On Tue, Aug 20, 2019 at 8:24 AM Timur Doumler <cpp_at_[hidden]> wrote:

Hi Brian,
> On 19 Aug 2019, at 23:23, Brian Bi via Std-Proposals <std-proposals_at_[hidden]> wrote:
> Exactly - that means it's still undefined. As I said in one of my earlier messages, it is undesirable to change the standard in a way that breaks lots of code which people will then not rewrite, as this erodes the legitimacy of the standard (and leaves users uncertain about what might get trampled by their compiler optimizers the next time they update.) Yet that's exactly what happened in C++17, and we should fix that.
Could you please explain exactly what change in C++17 you are referring to here?

The change to the object/memory model made by P0137. At the very least, this breaks any code similar to the following:

    int x = 12345;
    auto p = reinterpret_cast<unsigned char*>(x);
    for (int i = 0; i < sizeof(x); i++) {
        std::cout << p[i] << '\n';
    }

In Core Issue 1314, CWG said that "the current wording":

    The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T).

made it "sufficiently clear" that such code was well-defined. (August, 2011) The reinterpret_cast produces a pointer to the first byte of the object representation of x. The issue of whether the pointer arithmetic is valid was raised again in Core Issue 1701 - pointer arithmetic requires an array, and a "sequence" is not unambiguously an array. Core Issue 1701 is still unresolved. However, I feel confident saying that in August 2011, the code above was "supposed" to be well-defined, though a small number of people disagree. CWG simply had not realized at that point that the wording was defective. In 2013, when issue 1701 was raised, they realized the wording was defective.

But thanks to P0137, the above code is no longer well-defined even if you want to stretch the reading of the wording, because the result of the reinterpret_cast no longer points to the first byte of the object representation; it just points to the original int object. The pointer arithmetic cannot possibly do the right thing, and according to the plain wording, neither can the lvalue-to-rvalue conversions.
Thanks,
Timur
-- Brian Bi
--
Std-Proposals mailing list
Std-Proposals_at_lists.isocpp.org
https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
Received on 2019-08-21 08:03:07
