SANS ISC InfoSec Forums

As part of most vulnerability assessments and penetration tests against a website, we almost always run some kind of scanner. Burp (commercial) and ZAP (free from OWASP) are two commonly used scanners. Once you've done a few website assessments, you start to get a feel for what pages and fields are "likely candidates" for exploit. But especially if it's a vulnerability assessment, where you're trying to cover as many issues as possible (and exploits might even be out of scope), it's always a safe bet to run a scanner to see what other issues might be in play.

All too often, we see people take these results as-is, and submit them as the actual report. The HUGE problem with this is false positives and false negatives.

False negatives are issues that are real, but are not found by your scanner. For instance, Burp and ZAP aren't the best tools for pointing a big red arrow at software version issues, such as vulnerable versions of WordPress or WordPress plugins. You might want to use WPSCAN for something like that. Or if you go to the login page, a "view source" will often give you what you need.
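As a rough illustration of the "view source" approach, here is a minimal Python sketch that pulls version hints out of a page's HTML. It assumes the site exposes a generator meta tag or "?ver=" query strings on its stylesheet and script URLs, which many WordPress installs do but hardened sites often strip. The class and function names are made up for this example, and a "ver=" value on a plugin asset may be the plugin's version rather than the core version, so treat the output as leads to verify, not findings.

```python
import re
from html.parser import HTMLParser


class VersionHintParser(HTMLParser):
    """Collects version hints from the generator tag and from asset URLs."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # e.g. <meta name="generator" content="WordPress x.y.z">
        if tag == "meta" and (attrs.get("name") or "").lower() == "generator":
            self.hints.append(("generator", attrs.get("content") or ""))
            return
        # Stylesheets and scripts often carry a ?ver=x.y.z parameter
        if tag == "link":
            url = attrs.get("href")
        elif tag == "script":
            url = attrs.get("src")
        else:
            url = None
        if url:
            match = re.search(r"[?&]ver=([0-9][0-9A-Za-z.\-]*)", url)
            if match:
                self.hints.append((url.split("?")[0], match.group(1)))


def version_hints(html):
    """Return a list of (source, version string) pairs found in the HTML."""
    parser = VersionHintParser()
    parser.feed(html)
    return parser.hints
```

Feed it the saved source of the login page (or any page) and compare whatever comes back against the vendor's advisories by hand.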

Issues with the certificates will also go unnoticed by a dedicated web scanner - NIKTO or WIKTO are good choices for that. Or better yet, you can use openssl to pull the raw cert, or just view it in your browser.
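If you'd rather script the certificate check than click through a browser, the following is a minimal sketch using Python's standard ssl module in place of the openssl command line. The function names are my own. It also assumes the certificate validates against the local trust store: if it doesn't, the handshake raises ssl.SSLCertVerificationError, which is a finding in its own right.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_certificate(host, port=443, timeout=10):
    """Return the validated server certificate as a dict (subject, issuer, notAfter, ...)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(cert, now=None):
    """Whole days remaining before the certificate's notAfter date."""
    expires_epoch = ssl.cert_time_to_seconds(cert["notAfter"])
    expires = datetime.fromtimestamp(expires_epoch, tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days
```

Calling fetch_certificate("isc.sans.edu") and printing the "issuer", "subjectAltName" and "notAfter" entries covers most of what you'd look for by eye.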

(If you're noticing that much of what the cool tools will do is possible with some judicious use of your browser, that's exactly what I'm pointing out!)

NMAP is another great tool to use for catching what a web scanner might miss. For instance, if you've got a Struts admin page or Hypervisor login on the same IP as your target website, but on a different port than the website, NMAP is the go-to tool. Similarly, lots of basic site assessment can be done with NMAP's version detection (-sV and the related --version-* parameters), and the NSE scripts bundled with NMAP are a treasure trove as well! (Check out Manuel's excellent series on NMAP scripts).
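To make the "different port on the same IP" point concrete, here is a bare-bones TCP connect check in Python. This is not how NMAP works internally and is no substitute for it (no service fingerprinting, no scripts, no timing control); it only shows the idea of looking beyond ports 80 and 443. The function name and the port list in the usage note are assumptions for illustration.

```python
import socket


def open_tcp_ports(host, ports, timeout=1.0):
    """Return the ports from `ports` that accept a TCP connection on host."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass  # closed, filtered, or unreachable
    return found
```

Something like open_tcp_ports(target, [8080, 8443, 9443]) against an in-scope host will tell you whether there is anything else listening that deserves a closer look.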

False positives are just as bad - where the tool indicates a vulnerability where there is none. If you include blatant false positives in your report, you'll find that the entire report will end up in the trash can, along with your reputation with that client! A few false positives that I commonly see are "SQL Injection" and "OS Command Injection".

SQL Injection is a vulnerability where, from the web interface, you can interact with and get information from a SQL database that's "behind" the website, often dumping entire tables.

Website assessment tools (Burp in this case, but many other tools use similar methods) commonly test for SQL Injection by injecting a SQL "waitfor delay '0:0:20'" command. If this takes a significantly longer time to complete than the basic statement, then Burp will mark this as "Firm" for certainty. Needless to say, I often see this turn up as a false positive. What you'll find is that Burp generally runs multiple threads (10 by default) during a scan, so it can really run up the CPU on a website, especially if the site is mainly parametric (where pages are generated on the fly from database input during a session). Also, if a site's error handling routines take longer than they should, you'll see this test get thrown off.

So, how should we test to verify this initial/preliminary finding? First of all, Burp's test isn't half bad on a lot of sites. Testing Burp's injection with curl or a browser after the scanning is complete will sometimes show that the SQL injection is "real". Test multiple times with different delay values, so that you can show consistent and appropriate delays for values of 10, 30, 60, and 120 seconds.
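The "consistent and appropriate delays" check is easy to script once the scan has finished and the site is quiet. The sketch below is one way to do it in Python, not Burp's own logic: time a clean request for a baseline, time one request per delay value, and see whether each response took roughly baseline plus the requested delay. The function names and the 3 second tolerance are assumptions for illustration, and building the injected URL for each delay is left to you since it depends on the parameter Burp flagged.

```python
import http.client
import time
import urllib.request


def timed_get(url, timeout=180):
    """Return the wall-clock seconds taken to fetch url."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        pass  # HTTP errors and timeouts still tell us how long the server took
    return time.monotonic() - start


def delays_track_payload(baseline, observed, tolerance=3.0):
    """True if every observed time is about baseline + the requested delay.

    observed maps the requested delay in seconds to the measured response time.
    """
    return all(
        abs((elapsed - baseline) - requested) <= tolerance
        for requested, elapsed in observed.items()
    )
```

If every payload comes back in about the same time regardless of the delay you asked for, the original finding was most likely scan load or slow error handling rather than injection.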

If that fails - for instance if they all delay 10 seconds, or maybe no appreciable delay at all, don't despair - SQLMAP tests much more thoroughly, and should be part of your toolkit anyway - try that. Or test manually - after a few websites you'll find that testing manually might be quicker than an exhaustive SQLMAP test (though maybe not as "thorough").

If you use multiple methods (and there are a lot of different methods) and still can't verify that SQL injection is in play after that initial scan's finding, quite often this has to go into the "false positives" section of your report.

OS Command Injection - where you can execute unauthorized Operating System commands from the web interface - is another common false positive, and for much the same reason. To test for this vulnerability, the scanner will often use "ping -c 20 127.0.0.1" or "ping -n 20 127.0.0.1" - in other words, the injected command tells the webserver to ping itself, in this case 20 times. On most operating systems this creates a delay of about 20 seconds. As in the SQL injection example, you'll find that tests that depend on a predictable delay will often get "thrown off" if they are executed during a busy scan. Running them after the scan (again, using your browser or curl) is often all you need to do to prove these findings false. Testing other commands, such as pinging or opening an ftp session to a test host on the internet (one that is monitoring for such traffic using tcpdump or syslog), is another good "sober second thought" test, but be aware that if the website you are testing has an egress filter applied to its traffic, a successful injection might not generate the traffic you are hoping for - it'll be blocked at the firewall. If you have out of band access to the site being assessed, creating a test file is another good test.
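For the "test host on the internet" approach, tcpdump is the right tool if the injected command is a ping, since that is ICMP. If you inject something that opens a TCP connection instead (ftp, telnet, curl), a tiny listener on the test host is enough to record who called in. The following is a minimal Python sketch of such a listener; the function names are invented for this example, and the port you listen on is whatever you choose to put in the injected command.

```python
import socket
import threading
from datetime import datetime, timezone


def log_connections(listener, sightings, stop):
    """Record (UTC timestamp, source IP) for each inbound connection until stop is set."""
    listener.settimeout(0.5)  # wake up regularly so the stop flag is noticed
    while not stop.is_set():
        try:
            conn, address = listener.accept()
        except socket.timeout:
            continue
        sightings.append((datetime.now(timezone.utc).isoformat(), address[0]))
        conn.close()


def start_listener(port, bind_addr="0.0.0.0"):
    """Start logging in a background thread; returns (listener, sightings, stop, thread)."""
    listener = socket.create_server((bind_addr, port))
    sightings, stop = [], threading.Event()
    thread = threading.Thread(
        target=log_connections, args=(listener, sightings, stop), daemon=True
    )
    thread.start()
    return listener, sightings, stop, thread
```

A connection from the target's address shortly after you send the payload is strong evidence the injection is real. No connection proves much less, for the egress filtering reason mentioned above.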

Other tests can similarly see false positives. For instance, any tests that rely only on service "banner grabs" can be thrown off easily - either by admins putting a false banner in place, or if site updates update packages and services, but don't change that initially installed banner.

Long story short, never never never (never) believe that initial finding that your scanning tool gives you. All of the tools discussed are good tools - they should all be in your toolbox and in many cases should be at the top of your go-to list. Whether the tool is open source or closed, free or very expensive, they will all give you false positives, and every finding needs to be verified as either a true or false positive. In fact, you might not want to believe the results from your second tool either, especially if it's testing the same way. Whenever you can, go back to first principles and verify manually. Or if it's in scope, verify with an actual exploit - there's nothing better than getting a shell to prove that you can get a shell!

For false negatives, you'll also want to have multiple tools and some good manual tests in your arsenal - if your tool misses a vulnerability, you may find that many or all of your tools test for that issue the same way. Often the best way to catch a false negative is to just know how that target service runs, and know how to test for that specific issue manually. If you are new to assessments and penetration tests, false negatives will be much harder to find, and really no matter how good you are you'll never know if you got all of them.

If you need to discuss false positives and negatives with a non-technical audience, going to non-technical tools is a good way to make the point. A hammer is a great tool, but while screws are similar to nails, a hammer isn't always the best way to deal with them.

Please use our comment form to tell us about false positives or false negatives that you've found in vulnerability assessments or penetration tests. Keep in mind that usually these aren't an indicator of a bad tool, they're usually just a case of getting a proper parallax view to get a better look at the situation.

An excellent way to check website certificates and give a sanity check is the Qualys SSL Labs checker. Unfortunately, as it is web-based, it is only suitable for public-facing web services, but nevertheless I have found it really useful as a starting point and when standing up services.

Take a look at an example: https://www.ssllabs.com/ssltest/analyze.html?d=isc.sans.edu

Totally agree that a scan does not equal a pentest. However, I usually INCLUDE the raw scan results as an appendix, for the client's information. We often see triple digits' worth of vulnerabilities in the scans, and the pentest does not usually include manually testing every single one of those. We test the key, remotely-exploitable vulnerabilities that an attacker would likely leverage. I'll include language in my report to explain or highlight any of the more significant findings from the scanner that we did not actually test, and include the scan results as an appendix.