On 6/13/21 10:57 AM, Ville Voutilainen via Std-Proposals wrote:
On Sun, 13 Jun 2021 at 16:07, Tom Honermann via Std-Proposals
<std-proposals@lists.isocpp.org> wrote:
Compilers could warn on use of a poisoned value subject to their analysis limitations.  Consider this example:

void f(bool b1, bool b2) {
  int *x = POISON(nullptr);
  if (b1) {
    x = find_x();
  }
  if (b2) {
    use_x(*x);
  }
}

A defect is present if f() is called with b1 false and b2 true.  A compiler could warn about the existence of an execution path that results in use of the poisoned value, but such warnings will be false positives (FPs) if f() has a contract that b2 may only be true if b1 is also true.  Static analysis tools can do better by examining an entire program to determine if there is a call to f() with b1 false and b2 true, but such analysis is not necessarily conclusive since the values of b1 and b2 may derive from run-time input; in which case, issuing warnings will result in some FPs, and not issuing them will result in some false negatives (FNs).  A run-time tool can potentially perfectly identify such defects by tracking undefined/poisoned values and observing their use.

I think Edward had it right in his response about desiring potentially distinct behavior for secure, debug, and release compilation modes.
Intriguing. That makes this kinda like a contract assertion, but
instead of saying "if we come here, the value of x must be so-and-so",
it's saying "wherever we go, value of x must be so-and-so", and more
specifically, "wherever we go, if x is used, its value must
not be this value". There might be a general notion lurking here.

I think this is a special case of taint analysis and I agree this may be generalized further.

An example I've had in mind of a more general taint analysis facility looks something like this:

bool f(FILE *fp) {
  char buffer[BUF_SIZE];
  size_t s = fread(buffer, 1, BUF_SIZE, fp);
  if (s < 2) return false;
  // Mark the buffer as tainted until its contents are validated.
  TAINT(buffer, 0, s);
  // Oops, a tainted value was accessed, I'd like a diagnostic please.
  if (buffer[0] == 'X') { ... }
  // Now validate the contents.
  ALLOW_TAINT_INSPECTION {
    // Ok, buffer[0] is tainted, but we're in a sanitization context; no diagnostics please.
    switch (buffer[0]) {
      case 'X':
        // Ok, an expected value.
        SANITIZE(buffer, 0, 1);
      case 'Y':
        // Ok an expected value, map to X, implicitly sanitized.
        buffer[0] = 'Y';
      default:
        // An unexpected value, fail.
        return false;
    }
  }
  // Ok, buffer[0] is no longer tainted; no diagnostic please.
  if (buffer[0] == 'X') { ... }
  // Oops, buffer[1] is still tainted, I'd like a diagnostic please.
  if (buffer[1] == 'X') { ... }
  ...
}

I don't have a good syntax in mind for TAINT, ALLOW_TAINT_INSPECTION, or SANITIZE, nor do I think those are necessarily the right primitives.

I strongly suspect that there are excellent papers available on the subject of taint analysis, but I'm a novice on the subject and do not currently have any to recommend.  I suspect I have colleagues that do though, so I'll ask around.

Tom.