C++ Logo

std-proposals

Advanced search

Re: [std-proposals] Providing information about data structures to the compiler

From: Henning Meyer <hmeyer.eu_at_[hidden]>
Date: Sat, 15 Mar 2025 14:04:18 +0100
I am working on something similar, but approaching the problem from the
opposite direction:
Compared to a garbage collected language like C#, Java or JavaScript, we
cannot enumerate all objects in a program snapshot for debugging or
analysis. Using debug information, we can start with global variables,
local variables and thread-local variables, but we cannot reliably
follow e.g. a std::vector<std::unique_ptr<Object>> to it's owned objects.

For debuggers, that is currently worked around by hand-written XML or
Python code for each combination of container and debugger.

I think you should decompose your [[container]] idea into a range that
owns its memory:
  - For ranges, the language has a way of marking them, that is via
begin()/end() methods or free functions.
  - Your concern is about annotations for uniquely owned memory, which
is probably better served by providing aliasing information on pointers
(e.g. providing a C++ concept of __restrict__), similar to N4150.

I read about N3988 and N4150 ten years ago, and thought, sure, that
makes sense, but as far as I know, nothing ever came of it.

https://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4150.pdf

On 15.03.25 12:07, Ehsan Amiri via Std-Proposals wrote:
> AW: [std-proposals] Providing information about data structures to the
> compiler
>
> It took me some time to get back to this, partly because I wanted to
> discuss the idea with some people in the compiler optimization
> community as well.
>
> This is step 1 and the focus here is on a [[container]] attribute. I
> provide two examples (at the end of the email). I explain what the
> attribute means and what can be guaranteed by this attribute. Once I
> get feedback on this first step and make sure it can be done in a
> reasonable way, I will continue to the next steps (as discussed before).
>
> The [[container]] attribute is added to a member variable in example
> (1) and to the class itself in example (2). In example (2) the entire
> class should be a container (like std::vector or similar data
> structures implemented in different programs). I prefer (1) because it
> is more flexible. You can have a more complex class, that among other
> things maintains a container for its internal purposes. When
> [[container]] is put on the class (as in example (2)), we need another
> attribute to specify what is the underlying memory that implements the
> container. Here I have used a [[memory]] attribute.
>
> But what does [[container]] mean and what does it guarantee? It means
> there is an underlying memory for some data structure and:
>
> 1. This underlying memory is allocated , and deallocated exclusively
> by member functions of this class.
> 2. Two different container objects have distinct underlying memory.
>
> I prefer a stronger alternative for (2):
>
> * The allocator function used for allocation of the underlying
> memory, returns a “restrict” pointer (in the same sense that
> “restrict” keyword has in C. That is any pointer that is not based
> on this pointer will not alias the underlying memory).
>
> This stronger form guarantees (2) and provides other alias analysis
> guarantees as well. This matches vast majority of how containers (both
> STL and non-STL) are used in practice, but lack of formal guarantee,
> prevents compiler from proving legality of compiler optimizations.
>
> If we adopt this stronger form, we need something like a [[allocator]]
> attribute for allocator member functions that implies the returned
> pointer is “restrict”.
>
> Example (1)
>
> template< <class T, class Allocator>
>
> class ElementsArray {
>
> protected:
>
> [[container]] T *elements;
>
> unsigned int size;
>
> unsigned int capacity;
>
> ……
>
> public:
>
> void insert(...)
>
> …….
>
> };
>
> Example (2):
>
> template< <class T, class Allocator>
>
> [[container]] class ElementsArray {
>
> protected:
>
> [[memory]] T *elements;
>
> unsigned int size;
>
> unsigned int capacity;
>
> …………..
>
> public:
>
> void insert(...)
>
> …………..
>
> };
>
> *From:* Std-Proposals <std-proposals-bounces_at_[hidden]> *On
> Behalf Of *Sebastian Wittmeier via Std-Proposals
> *Sent:* Monday, January 13, 2025 5:03 PM
> *To:* std-proposals_at_[hidden]
> *Cc:* Sebastian Wittmeier <wittmeier_at_[hidden]>
> *Subject:* Re: [std-proposals] Providing information about data
> structures to the compiler
>
> Describing the containers in this way, is high-level and low-level at
> the same time.
>
> I would start such an endeavour by creating the description of new
> (possibly similar) containers, not existing ones from the standard
> library. Than you are more flexible to change their interface and
> guarantees to fit the proposal. Only as soon as that gives a good
> result usable by the compiler, one can think about transferring those
> results to the existing standard library and third-party tools.
>
> -----Ursprüngliche Nachricht-----
> *Von:* Ehsan Amiri via Std-Proposals <std-proposals_at_[hidden]>
> *Gesendet:* Mo 13.01.2025 22:48
> *Betreff:* Re: [std-proposals] Providing information about data
> structures to the compiler
> *An:* std-proposals_at_[hidden];
> *CC:* Ehsan Amiri <ehsan.amiri_at_[hidden]>;
> Thanks for the good points you brought up.
>
> The intention here is to help prove legality of compiler
> optimizations. I think we need to focus on common cases. Most
> important things in my opinion are the following (in the order of
> importance):
>
> 1- A definition of a “container” in a way that can provide
> guarantees about the underlying memory allocated for the container
> and who can access it. (To help with alias analysis)
> 2- Specifying basic operations such as
> add/remove/alloc/realloc/dealloc/size.
> 3- Possibly specifying the type of container could help as well
> (vector, priority queue, etc.). This can provide extra guarantees
> about the behavior of the code and we have optimizations that can
> take advantage of this.
>
> You correctly mentioned: “the list of possibilities is endless”. I
> agree. The more specialized attributes, are less likely to appear
> in the code and less likely to be useful for the compiler. So we
> have to stop somewhere.
>
> I hope this can address your concerns.
>
>

Received on 2025-03-15 13:04:23