Robert Morris, a former Cornell graduate student, wrote and released the original Internet worm in 1988. One method of transmission for his worm was a buffer overrun in fingerd. More recent worms spread by buffer overruns include blaster, nimda, slammer, and code red. Buffer overrun exploits typically enable an attacker to execute arbitrary code (i.e., do anything they want if it's a root process). Overruns continue to show up in both open and closed source software.
To create a static buffer overrun detection system for C source code with three goals:
Static methods analyze the source code without running it wheras dynamic methods watch the code as it runs or insert additional instructions into the code. Each strategy has its drawbacks and advantages.
Statically detecting where buffer overruns can occur in C code without any possibility of false positives or false negatives is computationally intractable. Therefore, most static methods attempt to make a safe and/or useful approximation. A safe approximation produces very few (ideally 0, but that is difficult with some languages) false negatives. False positives are common with static methods that are safe. More precise static methods are usually more computationally expensive.
More popular static methods include tools like (s)lint and even
grep. These tools can catch some problems with only a lexical
analysis--for instance, calls to gets
. However,
lint and grep are far from safe. There are many more
sophisticated algorithms out there, but they are beyond the
scope of this document.
Dynamic methods use extra checks at run-time to either prevent overruns from occurring or to detect them after the fact. Usually, the offending program is terminated (and the incident may be logged) for lack of anything better to do. Sometimes, an exception may be thrown. The primary weakness of dynamic methods is their inability to check all possible executions of the program for problems. Dynamic methods only detect problems in the current execution. Java and StackGuard both qualify as dynamic methods for detecting buffer overruns (only StackGuard will help with C code, of course). Languages like Java are safe (in theory), but tools like StackGuard are not foolproof.
The tool is a result of a series of incremental improvements over David Wagner's approach to detecting buffer overruns [3]. The two largest differences are:
A pointer analysis determines where pointers may point. It is an asymmetric binary relation between variables. At present, the tool is using a flow-insensitive and context-insensitive pointer analysis. This means that the pointer analysis treats all possible orderings of statements equally and essentially does not distinguish between different invocations of the same procedure (this is an oversimplification). Why not use something better? We're thinking about it--but it can be computationally expensive.
/* This program produces a false negative without pointer analysis */ main() { int i = 0, *p; int A[10]; p = &i; *p = 20; A[ i ] = 42; /* buffer overrun */ }
/* This program produces a false positive with a flow-insensitive pointer analysis */ main() { int i = 0, j = 20, *p; int A[10]; p = &j; p = &i; A[ *p ] = 42; /* not an overrun, but the tool would flag it */ } /* This program produces a false positive with a context-insensitive pointer analysis */ int *id( int *a ) { return a; } main() { int i = 0, j = 20, *p, *q; int A[10]; q = id(&j); p = id(&i); /* the PA thinks this may return &j or &i */ A[ *p ] = 42; /* not an overrun, but the tool would flag it */ }
/* Nothing is flagged in this program (there is a true negative) */ main() { int i; int A[10]; i = 15; for( i = 0; i < 10; i++ ) A[i] = 42; }
main() { int i,j; int A[10]; for( i = 0, j = 0; i < 10; i++, j++ ) A[j] = 42; /* the tool thinks the value of j is unbounded */ }
int id(int a) { return a; } main() { int A[10]; id(13); A[id(7)] = 42; /* false positive here because the tool thinks id(7) might return 13 */ }
Some other sources of imprecision include dead code, infeasible paths, and non-linear integer computations.
The tool (in various stages of development) has been used to detect about 20 low-risk buffer overruns in wu-ftpd. We have also confirmed that the tool finds known buffer overruns in applications including: sendmail, talkd, CLIPS, and strace.
Note: As far as we know, there are no false negatives in any of the benchmarks. However, it is possible to engineer a false negative in the C language. So there may be false negatives we do not know about.
Benchmark | Flow Sensitivity Enabled? | False Negatives | True Negatives | False Positives | True Positives | True Positive Rate ( True Positives / Positives ) |
wu-ftpd-2.6.2 | Yes, some conditionals | 0? | 1642 | 322 | 23 | 6.67% |
wu-ftpd-2.6.2 | Yes, but no conditionals | 0? | 1629 | 335 | 23 | 6.42% |
wu-ftpd-2.6.2 | No | 0? | 1604 | 360 | 23 | 6.01% |