Impact of False Positives on Breach Detection System Accuracy

False positives, those alarming notifications that turn out to be nothing at all, might initially seem like minor inconveniences, but they dramatically reduce the accuracy of security tools and create a huge burden for security analysts. Buried in the noise that numerous false positives generate, security staff find it extremely difficult to set correct breach response priorities.

Small Differences in False Positive Rates Make a Huge Difference

A “false positive” is a test result that incorrectly indicates that a particular condition or attribute is present. For example, in a medical test, a false positive would indicate that you have a particular condition when in fact you don’t. In security, it is typically the result of identifying something (e.g., a webpage, document, or network activity) as malicious when it is actually benign.

A security product’s false positive rate – the percentage of benign objects that the product incorrectly flags as malicious – is an important metric in determining how accurately the product identifies real threats. Most threat detection products generate some false positives, but some tools produce a great deal more than others. It’s important for security administrators to understand that even a small difference in a product’s false positive rate makes a very large difference in the product’s ability to correctly detect threats.

Why False Positives Are So Significant

Before we drill deeper into how false positives impact cybersecurity, let’s use a medical analogy to help understand why the effects of false positives are so significant.

Imagine that there’s a medical screening test designed to scan a large population for a serious disease. One percent of the population actually has the disease, and doctors use the test to determine who has the disease and who doesn’t.

The test has a 10% false negative rate. That means that for individuals who really do have the disease, the test says they do not 10% of the time.

The test has a 5% false positive rate. That means that for individuals who do not have the disease, the test says they do 5% of the time.

On the surface, our test seems quite useful. The problem, however, is that a sizable share of its results are incorrect. Some people will be told they don’t have the disease when they really do (false negatives), while others will be told they do have the disease when they really don’t (false positives). So how accurate is our test, really?

The real question is this: suppose your doctor gives you the test and the result comes back positive, suggesting you have the disease. Should you be worried? What are the chances that you actually have the disease when the test says you do? That question can only be answered if you know the test’s false positive rate.

Since such tests are assumed to be credible, without more information one might reasonably believe there’s a 75% or higher chance that the test is correct and you have the disease. Even a skeptic might rationally fear a fifty-fifty chance of having the disease. Although these seem like reasonable guesses, both are quite wrong.

Let’s take a look at what the numbers actually mean. The simplest way to check the test’s accuracy is to imagine a large group of people and count how many of them fall into each category of test result. For our scenario, let’s look at a thousand people.

Of the 1,000 people, only 10 really have the disease (1% of 1,000 is 10).

The test is 90% correct for people who have the disease, so it will get 9 of those 10 correct, and report that 9 people have the disease.

But 990 people do not have the disease. Unfortunately, because of the test’s 5% false positive rate, it will say that 49 of them do have the disease even though they don’t (5% of 990 is 49.5; we round down to 49).

So, out of 1,000 people, the test will say that 58 people have the disease, even though only 10 of them really do (9 plus 49 = 58).

Of the 58 people who are told they have the disease, only 9 actually do. So how accurate are the test results? For anyone the test flagged as having the disease, there is only about a 15% chance that they really do.

9 / 58 ≈ 15%
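The cohort arithmetic above can be sketched in a few lines of Python (a minimal illustration, using the article’s rounded-down count of 49 false positives):

```python
# A 1,000-person cohort with 1% disease prevalence.
population = 1000
sick = 10                    # 1% of 1,000
healthy = population - sick  # 990 people do not have the disease

true_positives = 9                     # a 10% false negative rate misses 1 of the 10
false_positives = int(healthy * 0.05)  # 5% false positive rate -> 49 (rounded down)

total_flagged = true_positives + false_positives  # 58 positive results
ppv = true_positives / total_flagged              # chance a positive result is real
print(f"{total_flagged} flagged, {ppv:.1%} actually sick")
```

Only 9 of the 58 flagged people are truly sick, which is where the roughly 15% figure comes from.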

Why are the odds of the test being correct so small even though the false positive rate seems relatively low? It’s because the odds of actually having the disease are so low that those who truly have it are greatly outnumbered by the false positives.

False positives have a dramatic impact on the accuracy of the test. For instance, if in the above scenario the false positive rate improved from 5% to 3%, the test would flag about 39 people instead of 58. If the false positive rate improved to 1%, the test would flag about 19 people, 9 of whom actually have the disease and 10 of whom are false positives, improving the overall accuracy of the test to roughly 50-50.
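Those improvements can be checked directly. This sketch uses exact expected counts rather than the rounded whole-person figures in the text, so its precision numbers are close to, but not exactly, the ones above:

```python
# Expected outcomes for 1,000 people, 1% prevalence, 10% false negative rate.
true_positives = 10 * 0.9  # 9 expected true detections

for fpr in (0.05, 0.03, 0.01):
    false_positives = 990 * fpr  # expected false alarms among the healthy
    precision = true_positives / (true_positives + false_positives)
    print(f"FPR {fpr:.0%}: {precision:.1%} of positive results are real")
```

At a 1% false positive rate the precision comes out near 48%, the “about 50-50” the text describes.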

Impact of False Positives on Breach Detection System Accuracy

Let’s shift our discussion of false positives from medical tests to how false positives impact cybersecurity and breach detection systems. The principles are exactly the same.

As one would expect, the lower the false positive rate for breach detection systems the better. And as in the medical world, small differences in false positive rates make a huge difference in a product’s ability to accurately detect a data breach. Let’s look at some actual breach detection systems, and what their false positive rates really mean.

The false positive rates presented in the following table are the actual values calculated by NSS Labs for 5 leading data breach detection systems in their 2016 Breach Detection Systems Group Test. (Download the 2017 report.)

Assuming that one in a thousand events is actually malicious, the table above shows the impact that different false positive rates have on the validity of the alerts a system generates. Because the vast majority of objects tested are harmless, even a breach detection system with a relatively low false positive rate of, say, 1% produces alerts that are incorrect in over 90% of cases.

For example, the third row of the table shows the accuracy of alerts for a system with a false positive rate of just 0.99%. This means that of the millions of objects the system evaluates, it will flag roughly 1% of them as dangerous when they are actually not dangerous at all. The effect is shown in the second and third columns: only about 9% of the alerts generated are correct, and more than 90% are incorrect.

The math looks like this:

If we start with 10,000 events or objects, and 1 in 1,000 of them is malicious, then 10 of the 10,000 are malicious (10,000 × .001 = 10). That means 9,990 of the objects are benign.

Of the 10 actually malicious objects, a system with a false negative rate of 10% will detect 90% of them, or 9 (10 × .9 = 9).

Of the 9,990 benign objects, a system with a false positive rate of 1% will flag roughly 100 of them as malicious, even though they are not (9,990 × .01 ≈ 100).

In this scenario, the system will report a total of 109 objects as malicious (9 + 100), when in reality there are only 10.

The system is correct only about 8% of the time (9 / 109 ≈ 8.3%). That means it is incorrect more than 90% of the time.
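The walkthrough above generalizes to any event volume, base rate, and error rates. Here is a small helper (the function and its name are our illustration, not part of any product):

```python
def alert_precision(events, malicious_rate, fn_rate, fp_rate):
    """Fraction of alerts that point at genuinely malicious objects."""
    malicious = events * malicious_rate
    benign = events - malicious
    true_alerts = malicious * (1 - fn_rate)  # real threats the system catches
    false_alerts = benign * fp_rate          # benign objects flagged anyway
    return true_alerts / (true_alerts + false_alerts)

# The scenario above: 10,000 events, 1 in 1,000 malicious, 10% FN rate, 1% FP rate.
p = alert_precision(10_000, 0.001, fn_rate=0.10, fp_rate=0.01)
print(f"{p:.1%} of alerts are correct")
```

With exact expected counts this comes out near 8.3% correct; driving the false positive rate toward zero is the only way to push that number up.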

As the table shows, unless the false positive rate is virtually zero, most of the information generated by the system is invalid. This forces security managers and analysts to engage their incident response and SOC teams to hunt down these ghosts. Because of the potential for damage, they have to investigate every alert. Unfortunately, having done so, they will find that there is nothing there, having squandered minutes, hours, or even days.

Low False Positives Enable Your Security Team to be Effective

Since false positives directly and dramatically impact the effectiveness of your security team, it’s critical that organizations understand the false positive rate of each security product they implement.

Even a small number of false positives will create far more unproductive, distracting work for your analysts than one might initially think.

Currently on leave from his position as Professor of Computer Science at UC Santa Barbara, Christopher Kruegel’s research interests focus on computer and communications security, with an emphasis on malware analysis and detection, web security, and intrusion detection. Christopher previously served on the faculty of the Technical University Vienna, Austria. He has published more than 100 peer-reviewed papers in top computer security conferences and has been the recipient of the NSF CAREER Award, MIT Technology Review TR35 Award for young innovators, IBM Faculty Award, and several best paper awards. He regularly serves on program committees of leading computer security conferences. Christopher was the Program Committee Chair of the Usenix Workshop on Large Scale Exploits and Emergent Threats (LEET, 2011), the International Symposium on Recent Advances in Intrusion Detection (RAID, 2007), and the ACM Workshop on Recurring Malcode (WORM, 2007). He was also the head of a working group that advised the European Commission (EC) on defenses to mitigate future threats against the Internet and Europe's cyber-infrastructure.