Document number | P****R0
Date | 2022-09-24
Reply-to | Jarrad J. Waterloo <descender76 at gmail dot com>
Audience | Evolution Working Group (EWG)
F43 direct dangling reduction
Abstract
“Lifetime issues with references to temporaries can lead to fatal and subtle runtime errors.” [1]
This paper proposes fixing one such cause of dangling references and pointers.
Motivating Examples
C++ Core Guidelines
F.43: Never (directly or indirectly) return a pointer or a reference to a local object [2]
Reason To avoid the crashes and data corruption that can result from the use of such a dangling pointer.
Example, bad After the return from a function its local objects no longer exist:
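int* f()
{
    int fx = 9;
    return &fx;  // BAD
}

void g(int* p)   // looks innocent enough
{
    int gx;
    cout << "*p == " << *p << '\n';
    *p = 999;
    cout << "gx == " << gx << '\n';
}

void h()
{
    int* p = f();
    int z = *p;  // read from abandoned stack frame (bad)
    g(p);        // pass pointer to abandoned stack frame to function (bad)
}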
…
Fortunately, most (all?) modern compilers catch and warn against this simple case.
Note This applies to references as well:
int& f()
{
    int x = 7;
    return x;  // bad: returns a reference to an object that is about to be destroyed
}
Note This applies only to non-static local variables. All static variables are (as their name indicates) statically allocated, so that pointers to them cannot dangle.
Example, bad Not all examples of leaking a pointer to a local variable are that obvious:
Note The address of a local variable can be “returned”/leaked by a return statement, by a T& out-parameter, as a member of a returned object, as an element of a returned array, and more.
…
Enforcement
- Compilers tend to catch return of reference to locals and could in many cases catch return of pointers to locals.
- Static analysis can catch many common patterns of the use of pointers indicating positions (thus eliminating dangling pointers)
This paper proposes that a non-static local that is not itself a pointer or a reference cannot be returned as a pointer or a reference. Doing so is an error, not a warning.
int* return_pointer()
{
    int local;
    return &local;  // error under this proposal
}

int& return_reference()
{
    int local;
    return local;  // error under this proposal
}
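For contrast, the following is a minimal sketch of returns that the proposed rule, as worded above, would presumably leave untouched: a return by value, a reference to a static local, and a local that is itself a pointer. All function names are hypothetical and are not taken from the paper.

```cpp
#include <cassert>

// All names below are hypothetical and chosen only for illustration.

// Returning by value copies the local out of the function: nothing dangles.
int return_value()
{
    int local = 42;
    return local;
}

// A static local has static storage duration, so the reference remains
// valid after the function returns.
int& return_static_reference()
{
    static int value = 7;
    return value;
}

// The local is itself a pointer, which the proposed wording excludes.
// Whether the result dangles depends on what the caller passed in.
int* return_pointer_local(int* p)
{
    int* local = p;
    return local;
}

// Usage: every access below is to an object that is still alive.
void use_examples()
{
    int x = 5;
    assert(return_value() == 42);
    return_static_reference() = 8;
    assert(return_static_reference() == 8);
    assert(return_pointer_local(&x) == &x);
}
```

The third function shows why the wording singles out locals that are not pointers or references: a pointer local may legitimately refer to an object that outlives the function.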
This should be an error instead of a warning because there is no scenario in which it is right. It is always wrong.
As the C++ Core Guidelines point out: “Fortunately, most (all?) modern compilers catch and warn against this simple case.” [2] As such, we would just be standardizing existing practice.
This paper does not propose fixing indirect references to locals, such as when the local is passed as an argument to another function or when a member of the local is accessed. Diagnosing those cases would require metadata that exists outside the function in question, whereas the direct case needs only the metadata already available while compiling that function. For instance, even accessing a member via the -> operator can be indirect, since that operator can be overloaded. Similarly, member access via the . operator is not part of the proposal because the C++ community has not given up on being able to overload the . operator, which would make something that is direct today indirect in a possible future. Of course, nothing prevents compilers from diagnosing more cases of dangling, but in those cases the compiler would be responsible for ensuring that the diagnostic is truly an error and not a false positive.
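The distinction between direct and indirect returns can be sketched as follows. This is an illustration under assumptions, not wording from the paper: the types `Registry` and `Handle` and all function names are hypothetical. The first function dangles but only through a call, so it falls outside the proposal; the second looks like it returns part of a local but does not dangle at all, because the overloaded -> operator redirects to an object with static storage duration.

```cpp
#include <cassert>

// All names below are hypothetical and chosen only for illustration.

// Returns its parameter by reference; harmless on its own.
int& identity(int& value) { return value; }

// Indirect: the local is routed through another function. Judging this
// return requires knowing what identity() does with its argument.
// The result dangles, so a caller must never use it.
int& dangling_through_call()
{
    int local = 1;
    return identity(local);  // dangles, but outside the proposal
}

struct Registry { int value = 10; };
Registry global_registry;

// A handle whose overloaded operator-> points at a global object.
struct Handle {
    Registry* operator->() const { return &global_registry; }
};

// Looks like a member of a local is returned, yet nothing dangles:
// the reference refers to global_registry, not to the local Handle.
int& not_dangling_through_arrow()
{
    Handle local;
    return local->value;
}

// Usage: only the well-defined paths are exercised here.
void use_examples()
{
    int v = 3;
    assert(&identity(v) == &v);
    assert(&not_dangling_through_arrow() == &global_registry.value);
}
```

Rejecting every `return local->member;` would therefore produce false positives, which is consistent with the paper limiting itself to the direct case.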
If this feature is so limited why even do it?
- There is no scenario where permitting this dangling is ever correct.
- Any reduction in dangling is a win.
- This is the simplest form of dangling and, as such, aids in teaching dangling.
Let me elaborate on that last point. Do we want to teach dangling on day one, likely hour one, to a new C++ programmer? When this is an error, the compiler becomes the teacher. Teaching dangling can be delayed somewhat, allowing the beginning programmer to gain more experience before delving into the dark world of dangling. When it is time to teach dangling, we can start with the dangling that the compiler already detects. From that springboard, we advance to ever more complex examples of dangling and their resolutions.
Further, failing to fix even this most basic form of dangling raises questions. Will we ever fix any dangling if we are unwilling to fix the most basic? How can one lead the C++ community if we fail to lead on problems that have been plaguing programmers for decades?
References
[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0936r0.pdf
[2] https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f43-never-directly-or-indirectly-return-a-pointer-or-a-reference-to-a-local-object