Date: Sat, 15 Mar 2025 16:35:43 +0100
Very worthwhile!
-----Original Message-----
From: Henning Meyer via Std-Proposals <std-proposals_at_[hidden]>
Sent: Sat 15.03.2025 16:28
Subject: Re: [std-proposals] Providing information about data structures to the compiler
To: Sebastian Wittmeier via Std-Proposals <std-proposals_at_[hidden]>;
CC: Henning Meyer <hmeyer.eu_at_[hidden]>;
My focus is on data structures that can be traversed linearly via
begin() and end(), like std::vector or std::map (which is a tree
internally). These can be nested, like a std::vector<std::set<int>>, in
which case I follow them recursively. If you have a very simple
tree-like data structure
struct Node {
    std::vector<Node> children;
};
that exposes its children as a range, then no special handling is necessary.
My hope is that more complicated cases can be handled via zero-cost
proxies and views.
Because all the work is in the implementation, I decided to start on
GNU/Linux with the Itanium ABI, ELF binaries with DWARF debug
information, the clang compiler, and the libstdc++ standard library.
Since that already puts me in LLVM, I also use the lld linker and the
lldb debugger. The same could be done in the GNU toolchain or on other
operating systems.
The C++ part (annotating functions in libraries to allow object
discovery and type recovery) can be done in an implementation agnostic
way. I don't think I can write down a full working spec before trying
out an implementation, though. I could write a proposal for the parts
that I have figured out so far.
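As a rough illustration of what such an annotation could look like in library code today: the sketch below uses the existing GCC/Clang attributes [[gnu::used]] and [[gnu::noinline]] as a stand-in for the [[debug]] annotation mentioned further down in this thread. The macro DEBUG_KEEP and the type IntBuffer are hypothetical, and the sketch does not place anything in a separate section.

```cpp
#include <cstddef>

// Hypothetical stand-in for a [[debug]] annotation, built from attributes
// that GCC and Clang already provide: `used` asks the compiler to emit the
// function even if nothing calls it, `noinline` keeps it out of line.
#if defined(__GNUC__) || defined(__clang__)
#define DEBUG_KEEP [[gnu::used, gnu::noinline]]
#else
#define DEBUG_KEEP
#endif

// Hypothetical container used only for illustration.
struct IntBuffer {
    int* data = nullptr;
    std::size_t count = 0;

    // Accessors a debugger or snapshot tool would want to find in the binary.
    DEBUG_KEEP const int* begin() const { return data; }
    DEBUG_KEEP const int* end() const { return data + count; }
    DEBUG_KEEP std::size_t size() const { return count; }
};
```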
I am not interested in interactive code at the moment. I want to do
automated analysis of program snapshots for evidence of memory
corruption. E.g., you have a linked list and due to a race condition or
use-after-free the very last node is corrupted and will lead to
undefined behavior when used. This could lie dormant and not cause a
crash until the program actually iterates over that list all the way to
the end.
If a tool is able to recursively follow containers to discover contained
objects then we will be able to diagnose these problems (you can
determine whether a pointer points to valid memory). I think there is an
under-used opportunity beyond compile-time and run-time checks in
snapshot analysis/coredump analysis.
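To make the linked-list case concrete, here is a small sketch of such a check. Everything in it is an assumption made for illustration: the Snapshot and Region types are a hypothetical in-memory model of a core dump, pointers are taken to be 64 bits in host byte order, and each list node is assumed to start with its next pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical model of one mapped memory region captured in a snapshot.
struct Region {
    std::uint64_t base;                // target address of the first byte
    std::vector<unsigned char> bytes;  // contents at snapshot time
};

struct Snapshot {
    std::vector<Region> regions;

    // Reads a 64-bit value at `address`. Returns nullopt when the eight
    // bytes are not fully contained in one captured region.
    std::optional<std::uint64_t> read_u64(std::uint64_t address) const {
        for (const Region& r : regions) {
            if (address < r.base) continue;
            const std::uint64_t offset = address - r.base;
            if (offset <= r.bytes.size() &&
                r.bytes.size() - offset >= sizeof(std::uint64_t)) {
                std::uint64_t value;
                std::memcpy(&value, r.bytes.data() + offset, sizeof value);
                return value;
            }
        }
        return std::nullopt;
    }
};

enum class ListStatus { Ok, DanglingLink, LimitExceeded };

// Walks a singly linked list whose nodes begin with a `next` pointer.
// Reports a dangling link when a node address is not backed by the
// snapshot, and gives up after `max_nodes` nodes (a possible cycle).
ListStatus check_list(const Snapshot& snap, std::uint64_t head,
                      std::size_t max_nodes) {
    std::uint64_t current = head;
    for (std::size_t i = 0; i <= max_nodes; ++i) {
        if (current == 0) return ListStatus::Ok;  // reached the null end
        const auto next = snap.read_u64(current);
        if (!next) return ListStatus::DanglingLink;
        current = *next;
    }
    return ListStatus::LimitExceeded;
}
```

A real tool would obtain the node layout from debug information rather than hard-coding it, but the validity test itself stays this simple: an address either lies inside captured memory or it does not.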
The necessary infrastructure for that does not exist at the moment, but
is possible with "non-virulent" annotations to existing code bases
(mostly libraries) and improvements to tooling. It would be useful for
other things, including interactive debugging.
On 15.03.25 15:44, Sebastian Wittmeier via Std-Proposals wrote:
> Re: [std-proposals] Providing information about data structures to the
> compiler
>
> Do you plan to support an interface for advanced data structures like
> trees or graphs?
>
> Or even interactive code?
>
> You are focusing on ELFs and DWARFs (dwarves?) for now? Or would the
> implementation be Unix and the attributes system independent?
>
> -----Original Message-----
> *From:* Henning Meyer via Std-Proposals
> <std-proposals_at_[hidden]>
> *Sent:* Sat 15.03.2025 15:11
> *Subject:* Re: [std-proposals] Providing information about data
> structures to the compiler
> *To:* Sebastian Wittmeier via Std-Proposals
> <std-proposals_at_[hidden]>;
> *CC:* Henning Meyer <hmeyer.eu_at_[hidden]>;
> Functions and methods just for use by the debugger would be eliminated
> completely by the optimizer in production builds. You would need an
> attribute to tell the compiler to keep them around even if they are not
> used.
>
> For example, in the case of ELF binaries with DWARF debug information,
> you would want the compiler to emit unoptimized, non-inlined functions
> for begin(), end(), size(), probably to_string() as well. These could go
> into a new, separate .debug_text section that can be stripped from
> binaries. This requires cooperation from libraries ([[debug]]
> annotations), compiler, linker and debugger. I am working on a prototype
> implementation.
> If the only purpose is for code to be run in the debugger, the functions
> could even be emitted as DWARF opcodes instead of machine code. For now I
> am lifting the generated machine code back to DWARF opcodes (for pure
> functions like std::vector begin() and end()) or running it in a VM
> (for to_string() methods and functions that might allocate and free).
>
> Containers aren't so bad; std::variant and std::any require more effort
> (if you want to support them generically, i.e. boost and everyone else's
> re-implementation as well).
>
>
> On 15.03.25 14:45, Sebastian Wittmeier via Std-Proposals wrote:
> > Re: [std-proposals] Providing information about data structures to the
> > compiler
> >
> > What one could do besides attributes is to have functions just for use
> > by the debugger.
> >
> > Like an integrated debug interface.
> >
> > Anything similar already out there for C++?
> >
> > Other languages:
> >
> > Elixir
> >
> > Inspect protocol
> >
> > https://hexdocs.pm/elixir/Inspect.html
> >
> > Rust:
> >
> > #[derive(Debug)] attribute
> >
> > https://doc.rust-lang.org/rust-by-example/hello/print/print_debug.html
> >
> > Julia:
> >
> > adding a method to the Base.show function
> >
> > https://docs.julialang.org/en/v1/manual/types/#man-custom-pretty-printing
> >
> > C#
> >
> > DebuggerDisplay attribute
> >
> > https://learn.microsoft.com/en-us/visualstudio/debugger/using-the-debuggerdisplay-attribute?view=vs-2022
> >
> > Python:
> >
> > __repr__ member function
> >
> > https://docs.python.org/3/library/functions.html#repr
> >
> > -----Original Message-----
> > *From:* Henning Meyer via Std-Proposals
> > <std-proposals_at_[hidden]>
> > - For ranges, the language has a way of marking them, that is via
> > begin()/end() methods or free functions.
> >
> >
> --
> Std-Proposals mailing list
> Std-Proposals_at_[hidden]
> https://lists.isocpp.org/mailman/listinfo.cgi/std-proposals
>
>
Received on 2025-03-15 15:40:57