Hi Thomas,

On Sun, Jun 1, 2025 at 10:32 AM Thomas Krogh Lohse via Std-Proposals <std-proposals@lists.isocpp.org> wrote:
Dear all,

I’ve just submitted my master’s thesis in Software Engineering from Aalborg University (defending it on June 6), which focuses on memory safety in C++, and I’d like to briefly share the core idea of my project.

The project defines a conservative safe subset of C++, and applies two static dataflow analyses:
    * A lifetime analysis to detect use-after-free, use-after-move, and similar issues.
    * A borrow checker-style analysis to ensure mutually exclusive access to resources.

The safe subset is inspired by Rust and restricts some inherently unsafe constructs:
    * Pointer dereferencing
    * `new` / `delete`
    * `reinterpret_cast`, `const_cast`, and C-style casts
    * Union field access
    * Labels and `goto`

This sounds mostly like the same basic premise as P3700, the paper I wrote with the intent of seeing it in WG21 next June in Sofia. It proposes a grid approach to making C++ safer, because the whole problem seems too large to be tractable in any one paper (and if such a paper were produced with sufficient detail, no human would ever be able to, or want to, read it). It proposes to split the safety problem into rows (the tools that we need to tackle safety), columns (the areas of safety that we want to tackle) and cells (specific things we should do). Two rows that I have in there are "removing features from the language that we know are bad" and "adding annotations to enable separate static analysis". But please do refer to the full paper at http://wg21.link/p3700 .

I've tried to work out the subsetting into P3716 (http://wg21.link/p3716) with the intent of having a way to exclude specific constructs in a standard and portable way.

The proposal you outlined needs subsetting of some sort, plus lifetime annotations, plus specific checks implemented within compilers that verify lifetimes and access to resources.

The specific subsets you are proposing seem like an odd set. Disallowing reinterpret_cast, const_cast and C-style casts seems eminently possible. Removing goto and labels makes the language cleaner but isn't necessarily directly related to safety. Removing union field access restricts one kind of UB. Removing new and delete forces people to use containers or make_unique/make_shared, which is a clear win, with the downside of making it impossible to implement make_shared, make_unique etc. within the subset itself. Then you also disallow pointer dereferencing, which seems like a huge change to the language, with consequences beyond what I can foresee.
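
To make the make_unique point concrete, here is a minimal sketch of what such a factory has to look like. The names my_make_unique, Point and sum_of_point are hypothetical and only for illustration; this is not how any particular standard library writes it. The raw `new` in the factory body is the construct a subset without new/delete would reject, so the body would need some opt-out, while the caller needs none.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for std::make_unique, for illustration only.
template <class T, class... Args>
std::unique_ptr<T> my_make_unique(Args&&... args) {
    // This raw `new` is what a subset without new/delete would reject,
    // so this body would need some way to opt out of the checks.
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// The caller uses no new, no delete and no raw pointer.
int sum_of_point() {
    auto p = my_make_unique<Point>(3, 4);
    assert(p != nullptr);
    return p->x + p->y;
}
```

The unsafe part is confined to one small function, which is roughly the shape I would expect an opt-out mechanism to encourage.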

I'd love to see a paper working out in detail why we want to subset out these specific bits.
 
Instead, developers are encouraged (by the language) to use smart pointers for ownership and lvalue-references for borrowing, promoting RAII by default.
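
As a rough illustration of that style (a sketch only; Buffer, grow, size_of, take_and_grow and demo_ownership_and_borrowing are made-up names, not taken from the thesis): ownership lives in a std::unique_ptr and moves by value, while functions that only need access borrow through an lvalue reference.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Buffer {
    int size = 0;
};

// Mutable borrow: the caller keeps ownership of the Buffer.
void grow(Buffer& buf, int extra) { buf.size += extra; }

// Read-only borrow.
int size_of(const Buffer& buf) { return buf.size; }

// Ownership transfer: taking the unique_ptr by value consumes it,
// and returning it hands ownership back to the caller.
std::unique_ptr<Buffer> take_and_grow(std::unique_ptr<Buffer> buf) {
    assert(buf != nullptr);
    grow(*buf, 10);
    return buf;
}

int demo_ownership_and_borrowing() {
    auto owner = std::make_unique<Buffer>();  // sole owner, RAII cleanup
    grow(*owner, 5);                          // borrowed mutably for the call
    owner = take_and_grow(std::move(owner));  // moved out, then moved back
    return size_of(*owner);                   // borrowed read-only
}
```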

The analyses are implemented in a proof-of-concept Clang plugin. Users can annotate types and functions with attributes (e.g., to define smart pointer behavior or skip analysis — similar to Rust's unsafe). It’s still a prototype and has some scalability and precision limitations, but it successfully enforces the subset and detects key violations. The implementation uses Andersen’s pointer analysis.

Currently, the analysis does not handle polymorphism, exceptions, or lambdas, though I outline ideas for addressing these in future work.

If refined with a more precise pointer analysis, some over-approximation fixes, and extended support for more of C++, I believe this approach could provide safety guarantees similar to Rust — but within standard, modern C++, without requiring a new frontend or language changes.

I’d love to hear your thoughts:
    * Do you see value in defining a "safe-by-default" C++ subset with opt-in unsafe features?
    * Could something like this analysis model help enforce safety in future directions for the language?

I see a future for this direction but it needs more rationale, and more details on how it works. In particular, we need to understand how pointer and object lifetimes are passed along, without only tackling the easy side of the problem.
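
A small example of what I mean by the harder side, as a sketch with hypothetical names (longer, safe_use): the returned reference borrows from one of the two arguments, and nothing in the signature says which. The analysis has to carry that relationship across the call boundary, either through annotations or through whole-program reasoning.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The result refers to either a or b; its lifetime is bounded by both.
const std::string& longer(const std::string& a, const std::string& b) {
    return a.size() >= b.size() ? a : b;
}

std::size_t safe_use() {
    std::string x = "short";
    std::string y = "much longer";
    const std::string& r = longer(x, y);  // fine: x and y both outlive r
    assert(&r == &y);
    return r.size();
}

// The call a checker would have to reject (left as a comment because
// using r afterwards would be undefined behaviour):
//
//   const std::string& r = longer(x, std::string("a temporary string"));
//   // the temporary dies at the end of the full expression, r dangles
```

Within a single function body this is easy to see. Across translation units it is not, and that is where I would like to see the details.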

The #1 result I hope we get from this is a compiler's blessing of most code, allowing developers to know that 90% of their code is safe and checked, while allowing the remaining 10% to be either fixed or reduced. It should at the very least make the 90%-C++ be recognized as a safe language in the same way that Rust, Java etc. are, as at least that much code is verifiably at no risk whatsoever of going out of bounds. Excluding polymorphism, exceptions or lambdas makes this drop below 90% for nearly all code bases.