std-proposals: Re: Poisoned initializers

From: Tom Honermann <tom_at_[hidden]>
Date: Sun, 13 Jun 2021 12:02:19 -0400

On 6/13/21 10:57 AM, Ville Voutilainen via Std-Proposals wrote:
> On Sun, 13 Jun 2021 at 16:07, Tom Honermann via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
>> Compilers could warn on use of a poisoned value subject to their analysis limitations. Consider this example:
>>
>> void f(bool b1, bool b2) {
>> int *x = POISON(nullptr);
>> if (b1) {
>> x = find_x();
>> }
>> if (b2) {
>> use_x(*x);
>> }
>> }
>>
>> A defect is present if f() is called with b1 false and b2 true. A compiler could warn about the existence of an execution path that results in use of the poisoned value, but such warnings will be false positives (FPs) if f() has a contract that b2 may only be true if b1 is also true. Static analysis tools can do better by examining an entire program to determine if there is a call to f() with b1 false and b2 true, but such analysis is not necessarily conclusive since the values of b1 and b2 may derive from run-time input; in which case, issuing warnings will result in some FPs, and not issuing them will result in some false negatives (FNs). A run-time tool can potentially perfectly identify such defects by tracking undefined/poisoned values and observing their use.
>>
>> I think Edward had it right in his response about desiring potentially distinct behavior for secure, debug, and release compilation modes.
> Intriguing. That makes this kinda like a contract assertion, but
> instead of saying "if we come here, the value of x must be so-and-so",
> it's saying "wherever we go, value of x must be so-and-so", and more
> specifically, "wherever we go, if x is used, its value must
> not be this value". There might be a general notion lurking here.

I think this is a special case of taint analysis and I agree this may be
generalized further.

An example I've had in mind of a more general taint analysis facility
looks something like this:

    bool f(FILE *fp) {
       char buffer[BUF_SIZE];
       size_t s = fread(buffer, 1, BUF_SIZE, fp);
       if (s < 2) return false;
       // Mark the buffer as tainted until its contents are validated.
       TAINT(buffer, 0, s);
       // Oops, a tainted value was accessed, I'd like a diagnostic please.
       if (buffer[0] == 'X') { ... }
       // Now validate the contents.
       ALLOW_TAINT_INSPECTION {
         // Ok, buffer[0] is tainted, but we're in a sanitization
    context; no diagnostics please.
         switch (buffer[0]) {
           case 'X':
             // Ok, an expected value.
             SANITIZE(buffer, 0, 1);
           case 'Y':
             // Ok an expected value, map to X, implicitly sanitized.
             buffer[0] = 'Y';
           default:
             // An unexpected value, fail.
             return false;
         }
       }
       // Ok, buffer[0] is no longer tainted; no diagnostic please.
       if (buffer[0] == 'X') { ... }
       // Oops, buffer[1] is still tainted, I'd like a diagnostic please.
       if (buffer[1] == 'X') { ... }
       ...
    }

I don't have a good syntax in mind for TAINT, ALLOW_TAINT_INSPECTION, or
SANITIZE, nor do I think those are necessarily the right primitives.

I strongly suspect that there are excellent papers available on the
subject of taint analysis, but I'm a novice on the subject and do not
currently have any to recommend. I suspect I have colleagues that do
though, so I'll ask around.

Tom.

Received on 2021-06-13 11:02:23