Re: [std-proposals] promote T[] and deprecate delete[]

From: Henning Meyer <hmeyer.eu_at_[hidden]>
Date: Wed, 2 Jul 2025 17:03:44 +0200
Hi David, thanks for your idea.

What I want are C++ programs that are checked for correctness at build
time instead of running into undefined behavior at run time.
So there will be a static analyzer (using the compiler front-end and
middle-end) that checks for pointer misuse. It does that by keeping
track of which allocations come from new[], where they go, and where
they end up (free/delete/delete[]/leaked).
You cannot reason about correctness of C++ programs without tracking
pointer provenance.
The distinction between memory allocated by new and memory allocated by
new[] is really a difference in type. We have a very rich type system
that happens to have a blind spot when it comes to this difference, and
everyone deals with that by layering complexity onto complexity.
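
To make the blind spot concrete, here is a minimal sketch in standard
C++17. The helper name allocate_and_release is hypothetical and exists
only for illustration. It shows that both forms of new yield the same
static type, so only the programmer's memory decides which delete is
correct.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Both new-expressions have the static type std::string*; nothing in the
// type records which form of delete is required later.
static_assert(std::is_same_v<decltype(new std::string),
                             decltype(new std::string[8])>);

// Hypothetical helper for illustration: the caller alone has to remember
// which pointer came from new and which from new[].
inline std::size_t allocate_and_release() {
    std::string* one  = new std::string("x");
    std::string* many = new std::string[8];
    std::size_t total = one->size() + many[0].size();
    delete one;     // matches new
    delete[] many;  // matches new[]; plain delete here would be undefined behavior
    return total;
}
```

Swapping the two delete forms still compiles without any required
diagnostic, which is the kind of error an analyzer has to find by
tracking provenance.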

Your suggestion represents the state of the art, which is that everyone
creates a custom wrapper that hides the ugly bits, plus custom
guidelines that say to avoid certain language features and to use custom
library types instead.
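
As one example of that wrapper approach using only the standard
library, here is a short sketch with std::unique_ptr<T[]>. The function
name wrapped_array_demo is hypothetical. The wrapper carries the
array-ness in its type, so the matching delete[] is chosen without the
programmer having to remember it.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

// Unlike the raw pointers, the two wrapper types are distinct.
static_assert(!std::is_same_v<std::unique_ptr<std::string>,
                              std::unique_ptr<std::string[]>>);

// Hypothetical helper for illustration.
inline std::size_t wrapped_array_demo() {
    // make_unique<T[]>(n) value-initializes n elements with new[].
    auto many = std::make_unique<std::string[]>(8);
    many[3] = "abc";
    return many[3].size();
    // At scope exit the array specialization calls delete[].
}
```

The distinction lives in the library type only; calling many.get()
hands back a plain std::string* again.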

I am fully aware that every change to basic types will break millions of
lines of code, because there are billions of lines of C++ out there.
There are also hundreds of little changes that would improve the core
language, not just this one.

If I want this change I probably have to go down the route of Circle and
cpp2 and fork the language (I won't, too much work).

On 02.07.25 10:34, David Brown wrote:
> On 01/07/2025 17:26, Henning Meyer via Std-Proposals wrote:
>> I think there was an (unavoidable) missed opportunity in C++98 when
>> new[] and delete[] were introduced.
>>
>> What we have currently is unchanged since 1998 and very C:
>>
>> new std::string[8] returns an object of type std::string*. It is
>> indistinguishable in its type from the result of new std::string.
>>
>> When you free it, you must remember that it was allocated via new[]
>> and pass it into delete[] instead of delete.
>> Calling delete instead of delete[] is undefined behavior and may lead
>> to crashes in practice, not just memory leaks.
>>
>> Instead, we could have the following:
>>
>> new T[n] returns an object of type T[],
>>
>> we can declare variables of type T[], they have a representation
>> identical to T*
>>
>> objects of type T[] decay to T* similar to array decay,
>>
>> delete p has the behavior of delete[] when p is of type T[].
>>
>> This would represent the difference in the type system and not just
>> in the logic within functions.
>>
>>
>> I think the state of T[] is very odd in the current language:
>>
>> variables cannot be declared:
>>
>> int p[]; // will not compile
>>
>> struct members can be declared, but this is C (flexible array
>> members) and not allowed in strict C++
>>
>> struct S {
>>
>> int p[];
>>
>> };
>>
>> There are headers written in C that use this syntax, and these
>> won't change, so as not to break existing code.
>>
>> Function parameters can be declared, but this is no different from
>> declaring a pointer
>>
>> void fun(int p[]); is the same as void fun(int* p);
>>
>> As far as I can tell, T[] in C++ is mostly used in specializations of
>> templates like std::unique_ptr<T[]> which is essentially syntactic
>> sugar over std::unique_ptr<T,array_deleter<T>>, as an array without
>> bound T[] cannot be meaningfully used in the current language.
>>
>> I think the C++ language rules can be amended to allow T[] to
>> represent T* allocated by new[] and backwards compatibility with C
>> headers can be preserved by disallowing this construct within
>> extern "C" blocks.
>>
>> Of course, it is easy to imagine generic C++ code that breaks when
>> the expression new[] returns a type that decays to T* instead of T*.
>> Whether that is relevant in practice can only be determined by
>> implementing the proposed changes in a compiler.
>>
>> I just thought I would ask whether I am the only one who thinks this
>> might be a good idea before (asking for help) implementing this in a
>> branch of GCC or LLVM.
>>
>> Regards,
>> Henning
>>
>
> As others have pointed out, making T[] distinct from T* would be a
> /massive/ change to the way the fundamental types in C++ work. It is
> not something that can be shoehorned into the language now. It is not
> something that could be changed just for improving delete (especially
> since we are now not supposed to use naked new and delete much, and of
> preference use containers and smart pointers).
>
> I have a suggestion of an alternative idea that would be much less
> intrusive, and might be feasible.
>
> When you use "new T" or "new T[10]", the low-level allocation
> functions make space on the heap for the type or array, and also
> somehow record information about the size of the actual allocation
> (which might be rounded up, such as for cache line alignment) and the
> number of elements in an array new allocation. Traditionally, C
> malloc/free systems did this by allocating a size_t worth of space
> more than you asked for, storing the allocation size in that size_t,
> and returning a pointer just after that size_t for the user data.
> Current C++ implementations can do something similar, or they can
> store the information elsewhere in some form. And they don't need to
> store information if they can calculate it later or will never need
> it. (The count of a new array is only really needed if the type has a
> non-trivial destructor.)
>
> So we can pretend that when you write "auto p = new T[10];", as well
> as getting back a T* pointer in p, the compiler has "magic" functions:
>
> size_t __real_allocation_size(T* p);
> size_t __array_count(T* p);
>
> How these "magic" functions are implemented is entirely
> implementation-dependent, but logical equivalents of these must exist
> for the current "delete" mechanism to work.
>
>
> My suggestion then is to introduce a new container type,
> std::dynarray<T>. This will always be an incomplete type, so you
> cannot have local or statically allocated instances of it, or return
> it from a function - mostly you will use pointers to the type. It
> will have the same interface as std::array<T, N> (including,
> crucially, the "data" member). But the size of the array is no longer
> a constant part of the type - it is now returned by __array_count(p)
> where "p" is a pointer to the dynarray<>.
>
>
> Now instead of using "auto p = new T[10];" then "delete[] p;", you can
> write "auto p = new std::dynarray<T>(10);", then "delete p;". The
> pointer to the dynarray can be safely passed around to functions, and
> used like a pointer to a container - it will be much like a
> non-resizeable vector but would have the same efficiency and overhead
> as a C-style array allocated on the heap with "new T[n]".
>
> Implementation could not be pure C++, as it needs the magic
> "__array_count" function.
>
>
> I don't know if that idea would suit your needs, but it might be a
> compromise between what you want, and something that has at least a
> vague hope of being implementable!
>
> David
>
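
For readers who want to experiment with the dynarray idea quoted above,
here is a rough approximation in portable C++17. It is not the proposed
std::dynarray<T>: the names dynarray_sketch and sum_first are
hypothetical, and this version is a complete type that stores the count
next to the pointer, because standard C++ has no __array_count to
recover it from the allocation. It only sketches the fixed-size,
std::array-like interface.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical approximation of the quoted dynarray idea.
// Assumption: the element count is kept in the handle, since portable
// code cannot ask the allocator for it.
template <typename T>
class dynarray_sketch {
public:
    // Allocates n value-initialized elements with new[].
    explicit dynarray_sketch(std::size_t n)
        : data_(std::make_unique<T[]>(n)), count_(n) {}

    std::size_t size() const noexcept { return count_; }
    T* data() noexcept { return data_.get(); }
    const T* data() const noexcept { return data_.get(); }
    T& operator[](std::size_t i) noexcept { return data_[i]; }
    const T& operator[](std::size_t i) const noexcept { return data_[i]; }
    T* begin() noexcept { return data(); }
    T* end() noexcept { return data() + count_; }

private:
    std::unique_ptr<T[]> data_;  // releases the elements with delete[]
    std::size_t count_;          // fixed at construction, never resized
};

// Hypothetical usage helper: fills the array with 1..n and sums it.
inline int sum_first(std::size_t n) {
    dynarray_sketch<int> a(n);
    int next = 0;
    for (int& x : a) x = ++next;
    int sum = 0;
    for (int x : a) sum += x;
    return sum;
}
```

Writing "auto p = new dynarray_sketch<int>(10);" followed by "delete p;"
also works with this sketch, but it costs a second allocation and an
extra stored size_t, which is the overhead the compiler-assisted version
is meant to avoid.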

Received on 2025-07-02 15:03:51