std-proposals: Poisoned initializers

From: Tom Honermann <tom_at_[hidden]>
Date: Sat, 12 Jun 2021 01:03:41 -0400

Accessing uninitialized variables or data members is a well-known source
of undefined behavior (UB). The fact that such accesses result in UB is
leveraged by tools like MemorySanitizer
<https://clang.llvm.org/docs/MemorySanitizer.html> to diagnose such
accesses. Programmers who use these tools may resist adding
initializers when a default value is not desirable since doing so
prevents diagnosing an unintended access before an appropriate value has
been determined and assigned. This results in undesirable compiler
warnings or static analysis complaints about uninitialized variables or
data members (undesirable because the omission is intentional). Worse,
if such an access is not diagnosed and corrected, the program will
exhibit UB at run-time.

This situation could perhaps be improved by allowing an initializer to
be "poisoned". The idea is that, when code is compiled for use with a
tool like MemorySanitizer, the initializer is effectively discarded or
the assigned object value otherwise marked as invalid so as to enable
diagnosing an invalid access. But when compiled normally, the provided
initializer is used so as to avoid UB and ensure consistent behavior at
run-time (though that behavior may still be wrong according to
programmer intentions).

For example:

    struct S {
       S() {}
       S(int v) : dm(v) {}

       // dm may only be used if a value was provided during construction.
       int dm = POISON(-1);
    };

    int f(bool b) {
       S s = b ? S() : S(42);
       // The following access of s.dm violates preconditions if b is true.
       // Would like MemorySanitizer to diagnose as an access of an
       // uninitialized data member. But if MemorySanitizer is not enabled,
       // then would like to reliably return -1.
       return s.dm;
    }

This would presumably require POISON() to be implemented via a keyword
or intrinsic function. Perhaps there is a clever way to implement it as
a library function.
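
As a rough approximation of the intended semantics with today's tools,
the member can keep an ordinary initializer and be poisoned explicitly
in the constructor that leaves it logically unset. This is only a
sketch: poison_object and HAVE_MSAN are hypothetical names made up for
illustration, it assumes Clang's __has_feature(memory_sanitizer) check
and the __msan_poison function from <sanitizer/msan_interface.h>, and it
does not provide the POISON(expr) initializer syntax shown above.

```cpp
// Detect MemorySanitizer. __has_feature is not available on every
// compiler, so guard its use.
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    include <sanitizer/msan_interface.h>
#    define HAVE_MSAN 1  // hypothetical macro name
#  endif
#endif

// Hypothetical helper, not an existing API: marks the bytes of obj as
// uninitialized for MemorySanitizer and does nothing in other builds.
template <class T>
void poison_object(T& obj) {
#ifdef HAVE_MSAN
    __msan_poison(&obj, sizeof(obj));
#else
    (void)obj;
#endif
}

struct S {
    // No value provided: keep the fallback value, but poison it when
    // building with MemorySanitizer.
    S() { poison_object(dm); }
    S(int v) : dm(v) {}

    // Fallback value used in ordinary builds.
    int dm = -1;
};

int f(bool b) {
    S s = b ? S() : S(42);
    // Without MemorySanitizer this reads the fallback value when b is
    // true. With MemorySanitizer the read is of poisoned memory.
    return s.dm;
}
```

In an ordinary build f(true) reads the fallback value and f(false)
reads 42. In a MemorySanitizer build the poisoned state should be copied
along with the object, so a later use of the value returned by f(true)
can be reported. The drawback is that every constructor that leaves dm
unset has to remember to poison it, which is what a POISON() initializer
would avoid.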

Worth pursuing? Prior art?

Tom.

Received on 2021-06-12 00:03:45