Date: Sat, 27 May 2023 16:16:35 -0400
On Sat, May 27, 2023 at 3:56 PM Federico Kircheis via Std-Discussion
<std-discussion_at_[hidden]> wrote:
>
> On 27/05/2023 16.49, Matthew House via Std-Discussion wrote:
> > On Sat, May 27, 2023 at 12:20 PM Jens Maurer via Std-Discussion
> > <std-discussion_at_[hidden]> wrote:
> >> On 27/05/2023 10.16, Federico Kircheis via Std-Discussion wrote:
> >>> On 27/05/2023 07.36, Federico Kircheis via Std-Discussion wrote:
> >>>> On 26/05/2023 23.23, Jens Maurer wrote:
> >>>>>
> >>>>> On 26/05/2023 21.30, Federico Kircheis via Std-Discussion wrote:
> >>>>>> I know that it is not possible to create objects out of thin air:
> >>>>>>
> >>>>>> int i = 42;
> >>>>>> char buffer[sizeof(int)];
> >>>>>> std::memcpy(buffer, &i, sizeof(int));
> >>>>>>
> >>>>>> int* j = reinterpret_cast<int*>(buffer);
> >>>>>> *j; // UB, even if *j == i, as j does not point to an int object
> >>>>>
> >>>>> This is not accurate since implicit object creation has been introduced.
> >>>>> Please read [intro.object] in its entirety.
> >>>>
> >>>> Mhm, I'll check again...
> >>>>
> >>>
> >>>
> >>> I think you were implying here that memcpy implicitly creates an object
> >>> of type int, and thus *j is valid.
> >>> Is that correct?
> >>
> >> No.
> >>
> >> [intro.object] p13 says that starting the lifetime of an array of unsigned
> >> char implicitly creates objects (oops, yours is an array of char, so that
> >> doesn't work), and memcpy then just copies the object representation
> >> into the already-existing object per [basic.types.general] p3.
> >>
> >>> if i was const, would then memcpy create implicitly a const int object?
> >>
> >> Since memcpy doesn't do anything special, "const int i" is fine, too.
> >>
> >> Jens
> >
> > std::memcpy does do something special, in that it implicitly creates objects
> > in its destination region ([cstring.syn] p3). Since 'buffer' is a char array
> > instead of an unsigned char array, it cannot provide storage for any created
> > objects, and starting the lifetime of an int object would end the lifetime
> > of the char array object. However, this does not prevent laundering its
> > pointer as an int pointer (as below), since a declared object with automatic
> > storage duration only has to live to the end of its block if its type has a
> > non-trivial destructor ([basic.life] p9).
> >
> > As language.lawyer points out, the snippet as written results in UB,
> > regardless of whether 'buffer' is a char or unsigned char array. The
> > lifetime of an int object can be started at the first address, but 'buffer'
> > (after an array-to-pointer conversion) still points to the first char
> > object in the array, rather than the int object. Since char is not similar
> > to int, 'j' still points to the first char object after the reinterpret_cast
> > ([expr.static.cast] p14), and reading an int from it is UB
> > ([basic.lval] p11).
> >
> > To access the created int object, we would have to use std::launder:
> >
> > int i = 42;
> > unsigned char buffer[sizeof(int)];
> > std::memcpy(buffer, &i, sizeof(int));
> >
> > int* j = std::launder(reinterpret_cast<int*>(buffer));
> > *j;
> >
> > And to elaborate on the const int question, C++ isn't like C, where memcpy
> > sets the effective type of the destination to the effective type of the
> > source. Instead, operations like std::memcpy that implicitly create objects
> > will use whatever types are expected by subsequent use of the destination
> > region, as long as they are implicit-lifetime types. In this case, declaring
> > 'buffer' also implicitly creates objects, so the int object could have been
> > created by either statement.
>
> Ok, I double-checked which types are eligible for implicit lifetime, and
> aggregates are.
>
> If I am not wrong
>
> struct s{
> int a;
> std::string str;
> };
>
> is an aggregate according to [dcl.init.aggr] and thus eligible for
> implicit lifetime
>
> If not (why?), ignore the follow-up question.
>
> const s v = s{1, "a very long string to avoid SSO"};
> unsigned char buffer[sizeof(s)];
> std::memcpy(buffer, &v, sizeof(s));
> s* v2 = std::launder(reinterpret_cast<s*>(buffer));
> v2->str[0] = 'a';

The UB starts at the `memcpy` itself, because `s` is not trivially
copyable. Having an object of the right type at the destination does
not mean that you can copy bits into it and get valid behavior. `s`
contains a `std::string`, which is not trivially copyable, and therefore
the affordances provided by [basic.types]/2 do not apply.
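
To make the distinction concrete, here is a minimal sketch, assuming a
C++20 compiler. The `plain` struct and the helpers `aligned_for`,
`copy_plain`, and `copy_s` are hypothetical names introduced only for
illustration. It copies a trivially copyable aggregate with `memcpy` plus
`std::launder`, and copies `s` by running its copy constructor through
placement new instead:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>

// The aggregate from the question: an aggregate, but not trivially copyable.
struct s {
    int a;
    std::string str;
};

// Hypothetical counterpart whose members are all trivially copyable.
struct plain {
    int a;
    double b;
};

static_assert(std::is_aggregate_v<s>);
static_assert(!std::is_trivially_copyable_v<s>);
static_assert(std::is_trivially_copyable_v<plain>);

// Returns true if p is suitably aligned for the given alignment.
inline bool aligned_for(const unsigned char* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Trivially copyable: copying the bytes is sufficient. std::launder is
// needed to obtain a pointer to the implicitly created plain object.
inline plain* copy_plain(unsigned char* storage, const plain& src) {
    assert(aligned_for(storage, alignof(plain)));
    std::memcpy(storage, &src, sizeof(plain));
    return std::launder(reinterpret_cast<plain*>(storage));
}

// Not trivially copyable: run the copy constructor via placement new.
// The caller is responsible for calling std::destroy_at on the result.
inline s* copy_s(unsigned char* storage, const s& src) {
    assert(aligned_for(storage, alignof(s)));
    return ::new (static_cast<void*>(storage)) s(src);
}
```

A caller would declare the storage as `alignas(s) unsigned char
buffer[sizeof(s)];`, call `copy_s(buffer, v)`, and later pass the returned
pointer to `std::destroy_at` so that the copied string releases its
allocation. The buffer in the original snippet has no `alignas`, which is
a separate problem from the `memcpy`.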
Received on 2023-05-27 20:16:48