Date: Sun, 13 Jun 2021 12:02:19 -0400
On 6/13/21 10:57 AM, Ville Voutilainen via Std-Proposals wrote:
> On Sun, 13 Jun 2021 at 16:07, Tom Honermann via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
>> Compilers could warn on use of a poisoned value subject to their analysis limitations. Consider this example:
>>
>> void f(bool b1, bool b2) {
>> int *x = POISON(nullptr);
>> if (b1) {
>> x = find_x();
>> }
>> if (b2) {
>> use_x(*x);
>> }
>> }
>>
>> A defect is present if f() is called with b1 false and b2 true. A compiler could warn about the existence of an execution path that results in use of the poisoned value, but such warnings will be false positives (FPs) if f() has a contract that b2 may only be true if b1 is also true. Static analysis tools can do better by examining an entire program to determine if there is a call to f() with b1 false and b2 true, but such analysis is not necessarily conclusive since the values of b1 and b2 may derive from run-time input; in which case, issuing warnings will result in some FPs, and not issuing them will result in some false negatives (FNs). A run-time tool can potentially perfectly identify such defects by tracking undefined/poisoned values and observing their use.
>>
>> I think Edward had it right in his response about desiring potentially distinct behavior for secure, debug, and release compilation modes.
> Intriguing. That makes this kinda like a contract assertion, but
> instead of saying "if we come here, the value of x must be so-and-so",
> it's saying "wherever we go, value of x must be so-and-so", and more
> specifically, "wherever we go, if x is used, its value must
> not be this value". There might be a general notion lurking here.
I think this is a special case of taint analysis and I agree this may be
generalized further.
An example I've had in mind of a more general taint analysis facility
looks something like this:
bool f(FILE *fp) {
char buffer[BUF_SIZE];
size_t s = fread(buffer, 1, BUF_SIZE, fp);
if (s < 2) return false;
// Mark the buffer as tainted until its contents are validated.
TAINT(buffer, 0, s);
// Oops, a tainted value was accessed, I'd like a diagnostic please.
if (buffer[0] == 'X') { ... }
// Now validate the contents.
ALLOW_TAINT_INSPECTION {
// Ok, buffer[0] is tainted, but we're in a sanitization
context; no diagnostics please.
switch (buffer[0]) {
case 'X':
// Ok, an expected value.
SANITIZE(buffer, 0, 1);
case 'Y':
// Ok an expected value, map to X, implicitly sanitized.
buffer[0] = 'Y';
default:
// An unexpected value, fail.
return false;
}
}
// Ok, buffer[0] is no longer tainted; no diagnostic please.
if (buffer[0] == 'X') { ... }
// Oops, buffer[1] is still tainted, I'd like a diagnostic please.
if (buffer[1] == 'X') { ... }
...
}
I don't have a good syntax in mind for TAINT, ALLOW_TAINT_INSPECTION, or
SANITIZE, nor do I think those are necessarily the right primitives.
I strongly suspect that there are excellent papers available on the
subject of taint analysis, but I'm a novice on the subject and do not
currently have any to recommend. I suspect I have colleagues that do
though, so I'll ask around.
Tom.
> On Sun, 13 Jun 2021 at 16:07, Tom Honermann via Std-Proposals
> <std-proposals_at_[hidden]> wrote:
>> Compilers could warn on use of a poisoned value subject to their analysis limitations. Consider this example:
>>
>> void f(bool b1, bool b2) {
>> int *x = POISON(nullptr);
>> if (b1) {
>> x = find_x();
>> }
>> if (b2) {
>> use_x(*x);
>> }
>> }
>>
>> A defect is present if f() is called with b1 false and b2 true. A compiler could warn about the existence of an execution path that results in use of the poisoned value, but such warnings will be false positives (FPs) if f() has a contract that b2 may only be true if b1 is also true. Static analysis tools can do better by examining an entire program to determine if there is a call to f() with b1 false and b2 true, but such analysis is not necessarily conclusive since the values of b1 and b2 may derive from run-time input; in which case, issuing warnings will result in some FPs, and not issuing them will result in some false negatives (FNs). A run-time tool can potentially perfectly identify such defects by tracking undefined/poisoned values and observing their use.
>>
>> I think Edward had it right in his response about desiring potentially distinct behavior for secure, debug, and release compilation modes.
> Intriguing. That makes this kinda like a contract assertion, but
> instead of saying "if we come here, the value of x must be so-and-so",
> it's saying "wherever we go, value of x must be so-and-so", and more
> specifically, "wherever we go, if x is used, its value must
> not be this value". There might be a general notion lurking here.
I think this is a special case of taint analysis and I agree this may be
generalized further.
An example I've had in mind of a more general taint analysis facility
looks something like this:
bool f(FILE *fp) {
char buffer[BUF_SIZE];
size_t s = fread(buffer, 1, BUF_SIZE, fp);
if (s < 2) return false;
// Mark the buffer as tainted until its contents are validated.
TAINT(buffer, 0, s);
// Oops, a tainted value was accessed, I'd like a diagnostic please.
if (buffer[0] == 'X') { ... }
// Now validate the contents.
ALLOW_TAINT_INSPECTION {
// Ok, buffer[0] is tainted, but we're in a sanitization
context; no diagnostics please.
switch (buffer[0]) {
case 'X':
// Ok, an expected value.
SANITIZE(buffer, 0, 1);
case 'Y':
// Ok an expected value, map to X, implicitly sanitized.
buffer[0] = 'Y';
default:
// An unexpected value, fail.
return false;
}
}
// Ok, buffer[0] is no longer tainted; no diagnostic please.
if (buffer[0] == 'X') { ... }
// Oops, buffer[1] is still tainted, I'd like a diagnostic please.
if (buffer[1] == 'X') { ... }
...
}
I don't have a good syntax in mind for TAINT, ALLOW_TAINT_INSPECTION, or
SANITIZE, nor do I think those are necessarily the right primitives.
I strongly suspect that there are excellent papers available on the
subject of taint analysis, but I'm a novice on the subject and do not
currently have any to recommend. I suspect I have colleagues that do
though, so I'll ask around.
Tom.
Received on 2021-06-13 11:02:23