Steven M. Christey coley at linus.mitre.org
Thu May 27 18:22:18 EDT 2010

I participated in SATE 2008 and SATE 2009, much more actively in the 2008 
effort.  I'm not completely sure of the status of the 2009 results and 
final publication, as I've been otherwise occupied lately :-/ It looks 
like the final report has been delayed until June (the SATE 2008 report 
wasn't published until July 2009).

For SATE 2008, we did not release final results because the human analysis 
itself had too many false positives - that is, we sometimes claimed a 
false positive when, in fact, the issue was a true positive.  Given this 
and other data-quality problems (e.g. we only covered ~12% of the more 
than 49,000 items), we believed that releasing the raw data would make it 
far too easy for people to draw completely wrong conclusions about the 
tools.

> The problems that the data would have revealed is:
> 1) false positive rates from these tools are overwhelming

As covered extensively in the 2008 SATE report (see my section, for 
example), there is no clear definition of "false positive", especially 
when it comes to proving that a specific finding is a vulnerability.

For example: suppose a tool reports a buffer overflow in a function.  To 
prove the finding is a vulnerability, you have to trace back through all 
the data flow, sometimes going 20 levels deep.  It is not feasible for a 
human evaluator to determine whether there's really a vulnerability in 
every such case.  Or, maybe the overflow happens when you're reading a 
configuration file that's only under the control of the administrator. 
These could be regarded as false positives.  However, the finding may be 
"locally true" - i.e. the function itself might not do any validation at 
all, so *if* it's called incorrectly, an overflow will occur (see the 
sketch below).  My suspicion is that a lot of the "false positives" 
people complain about are actually "locally true."  And, as we saw in 
SATE 2008 (and 2009 I suspect), sometimes the human evaluator is actually 
wrong, and the finding is correct.  Hopefully we'll account for "locally 
true" in the design of SATE 2010.
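
To make "locally true" concrete, here's a minimal, hypothetical C sketch 
(my own illustration, not code from any SATE target program):

    #include <stdio.h>
    #include <string.h>

    /* No validation here: a tool will flag the strcpy(), and the
     * finding is "locally true" - if name is ever longer than 15
     * characters, buf overflows.  Whether that is an exploitable
     * vulnerability depends on every caller, possibly many levels
     * up the data flow. */
    static void greet(const char *name)
    {
        char buf[16];
        strcpy(buf, name);          /* flagged: unbounded copy */
        printf("hello, %s\n", buf);
    }

    int main(void)
    {
        greet("admin");  /* constant, short input: can never overflow */
        /* A call like greet(user_input) with untrusted input would be
         * the vulnerable path - proving whether such a path exists is
         * the expensive part of triage. */
        return 0;
    }

The tool's report on the strcpy() is correct about the function in 
isolation; deciding whether it's a "true" or "false" positive requires 
whole-program reasoning that the evaluator often can't afford.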

> 2) the work load to triage results from ONE of these tools were 
> man-years

This was also covered (albeit as an estimate) in the 2008 SATE report, in 
both the original section and my section.

> 3) by every possible measurement, manual review was more cost effective

There was no consideration of cost in this sense.

One lost opportunity for SATE 2008, however, was in comparing the results 
from the manual-review participants (e.g. Aspect) against the tools in 
terms of what kinds of problems got reported.  (This also had major 
implications for how to count the number of results.)  I believe that 
such a focused effort would have shown some differences in what got 
reported.  At least, that's in the raw data, since it shows who claimed 
what got reported.

While the SATE 2008 report is quite long, mostly thanks to my excessive 
verbiage, I believe people who read that document will see that SATE has 
been steadily improving its design over the years.  The reality is that 
any study of this type is going to suffer from limited manpower for 
evaluating the results.


> The coverage was limited ONLY to injection and data flow problems that 
> tools have a chance of finding. In fact, the NIST team chose only a 
> small percentage of the automated findings to review, since it would 
> have taken years to review everything due to the massive number of false 
> positives. Get the problem here?

While there were focused efforts on various types of issues, there was 
also random sampling to get some exposure to the wide range of problems 
being reported by the tools.  Your critique of SATE with respect to its 
focus on tools versus manual methods is understandable, but SATE (and its 
parent SAMATE project) is really about understanding tools, so this focus 
should not be a surprise.  After all, the first three letters of SATE 
expand to "Static Analysis Tool."

- Steve
