Date: Thu, 25 Mar 2021 22:18:05 -0400
The following is based on this Stack Overflow question:
https://stackoverflow.com/questions/66755218. I will expand on why the
obvious solutions to this question are not adequate, and why the
wording in the standard may be deficient.
Given a type `T` which is *not* an implicit lifetime type, consider
the following code:
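```
void *ptr = malloc(sizeof(T) * 10);  // raw storage for 10 T's
T *arr = reinterpret_cast<T*>(ptr);
for(int i = 0; i < 5; ++i)
{
    new(arr) T(i);  // construct the i-th element in place
    ++arr;
}
T *last = arr;      // one past the last constructed element
T *end = arr + 5;   // one past the end of the storage
```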
We know what this wants to do; it's standard `vector<T>` work. We
create storage for 10 `T`s, initialize 5 of them, and store a pointer
to the not-yet-constructed 6th element and a pointer past the end of
the full array of 10 elements.
In C++17, every instance of pointer arithmetic in this code is
undefined behavior. This is true because there is no array of `T` in
that storage, and pointer arithmetic only works on the basis of arrays
of `T`s.
C++20's implicit object creation (IOC) rules are intended to correct
this. They do so by allowing `malloc` to implicitly create whatever
objects of implicit lifetime types are needed to make the code work.
In our case, we need an array of `T`s, and arrays are implicit
lifetime types, so `malloc` creates that array. But since `T` is not
itself an implicit lifetime type, `malloc` doesn't create the `T`
subobjects *within* that array.
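For contrast, here is a minimal sketch of the case IOC does cover, where
the element type is itself an implicit lifetime type. The types `Trivial`
and `NonTrivial` and the function `sum_first_three` are hypothetical
names for illustration only; they are not from the question.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>

// Hypothetical example types, not part of the original question.
struct Trivial { int value; };                // aggregate: implicit lifetime

struct NonTrivial {                           // user-provided constructor and
    explicit NonTrivial(int v) : value(v) {}  // destructor: not implicit
    ~NonTrivial() {}                          // lifetime
    int value;
};

static_assert(std::is_aggregate_v<Trivial>);
static_assert(!std::is_aggregate_v<NonTrivial>);
static_assert(!std::is_trivially_destructible_v<NonTrivial>);

// With an implicit lifetime element type, malloc can implicitly create
// both the Trivial[10] array and its elements, so the returned pointer
// can point to the first element and indexing is well-defined in C++20.
int sum_first_three() {
    void* storage = std::malloc(sizeof(Trivial) * 10);
    assert(storage != nullptr && "allocation failed");
    Trivial* arr = static_cast<Trivial*>(storage);
    for (int i = 0; i < 3; ++i) {
        arr[i].value = i + 1;  // assigns 1, 2, 3
    }
    int sum = arr[0].value + arr[1].value + arr[2].value;
    std::free(storage);
    return sum;
}
```

Swapping `Trivial` for `NonTrivial` here is exactly the situation the
rest of this message is about: the array can still be implicitly
created, but its elements cannot.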
The problem is this: what does `malloc` return?
We know what `malloc`, under IOC rules, *does*. It allocates storage
and creates an array of 10 `T`s within that storage.
But what does `malloc` return?
According to the standard, it returns a "pointer to a suitable created
object". But the only object that was created was a `T[10]`, not the
`T`s themselves.
Since this is the only object which is implicitly created in this
scenario, that means that `malloc` returns a pointer, of type `void*`,
which points to an array of 10 `T`s. The actual type of the object
being pointed to is `T[10]`, so a correctly typed pointer to it would
be a `T(*)[10]`.
This would all be sophistry if not for one tiny problem: an array
object is *not* pointer-interconvertible with its first element, so a
`reinterpret_cast` cannot turn a pointer to one into a pointer to the
other. So what we have after the cast is a pointer, of type `T*`,
which still points to the array of 10 `T`s. I don't want to get into
the question of what it means to create an object through such a
pointer, as there is a more fundamental problem.
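To make the distinction concrete, here is a small sketch using an
ordinary `int[10]` (so that both the array and its elements definitely
exist). The names `storage`, `to_array`, `to_first` and
`elements_per_array_step` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int storage[10] = {};

int (*to_array)[10] = &storage;  // points to the array object itself
int *to_first = &storage[0];     // points to the first element

// The two pointers have different types...
static_assert(std::is_same_v<decltype(&storage), int (*)[10]>);
static_assert(!std::is_same_v<decltype(&storage), decltype(&storage[0])>);

// ...and designate objects of different sizes.
static_assert(sizeof(*to_array) == 10 * sizeof(*to_first));

std::size_t elements_per_array_step() {
    // Both pointers represent the same address, yet they point to
    // different objects: the whole array versus its first element.
    assert(static_cast<void*>(to_array) == static_cast<void*>(to_first));
    // Stepping to_array by one would skip this many elements.
    return sizeof(*to_array) / sizeof(*to_first);
}
```

Having the same address is not enough; which object a pointer points to
is what determines whether arithmetic on it is defined.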
The above code *definitely* exhibits UB when it attempts to do pointer
arithmetic on `arr`. You do not have a pointer to an element in the
array; you only have a pointer to the array *itself*. And doing
pointer arithmetic on a broken pointer doesn't work.
Note that not even `std::launder<T>` would work, because `launder`
requires that a `T` must exist at that address, and it *doesn't* yet.
And even if it did, it's not clear that creating a `T` at that address
would constitute creating an element subobject of the existing array.
After all, you didn't start with a pointer into the array; it was a
pointer *to* the array.
It's not clear if you could even resolve this by doing the following:
```
auto *arr_ptr = reinterpret_cast<T(*)[10]>(ptr);
T *arr = &(*arr_ptr)[0];
```
This might be considered accessing the first element of the array,
which doesn't exist.
Am I missing something from the standard, or is this a defect in the
IOC wording?
Note that using `std::allocator<T>::allocate` works by fiat; the
function is defined to return a pointer, not to the array, but to the
first element of the array. As such, if this is a defect, I would
suggest we use this as the template for how to fix it (rather than
making array-pointers pointer-interconvertible with pointers to the
first element). Essentially, if the IOC object created at the start of
the storage is an array, the "suitable created object" pointer can be
a pointer to the first element of the array rather than the array
itself. This should be true even when no object of the element type
was implicitly created.
That is, the pointer to a "suitable created object" doesn't have to
point to an object that was actually created.
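For reference, here is a sketch of the same `vector<T>`-style work done
through `std::allocator<T>`, whose `allocate` is specified to return a
pointer to the initial element of an array of `T`. `Widget` and
`fill_and_sum` are hypothetical names standing in for `T` and the
surrounding code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for T: not an implicit lifetime type.
struct Widget {
    explicit Widget(int v) : value(v) {}
    ~Widget() {}
    int value;
};

int fill_and_sum() {
    using Alloc = std::allocator<Widget>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    constexpr std::size_t capacity = 10;

    // allocate() yields a pointer to the first element of a Widget[10],
    // so pointer arithmetic within [first, first + capacity] is fine.
    Widget *first = alloc.allocate(capacity);
    Widget *last = first;
    for (int i = 0; i < 5; ++i) {
        Traits::construct(alloc, last, i);  // placement-constructs Widget(i)
        ++last;
    }
    Widget *end = first + capacity;
    assert(end - last == 5);  // five slots remain unconstructed

    int sum = 0;
    for (Widget *p = first; p != last; ++p) {
        sum += p->value;  // 0 + 1 + 2 + 3 + 4
    }

    // Destroy only the constructed elements, then release the storage.
    for (Widget *p = first; p != last; ++p) {
        Traits::destroy(alloc, p);
    }
    alloc.deallocate(first, capacity);
    return sum;
}
```

The suggested fix would give the `malloc` version the same starting
point: a pointer to the first element rather than to the array.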
Received on 2021-03-25 21:18:19