Antivirus evolved

Some say antivirus is an outdated technology. What does “antivirus” even mean? For us, antivirus is the most commonly recognized term that means for customers “a product that stops bad programs from infecting my device.” Saying “antivirus” is similar to when you hear a Southerner (like myself) say “Coke” when referring to a carbonated beverage. Or like when my partner, who is from the UK, says it’s time to “Hoover” the house, when he really means to vacuum.

The original connotation of the term “antivirus” has become defunct. Everyone knows it’s not just about viruses anymore—there’s more to it than that. The traditional means of protecting customers by having humans write signatures based on malware they’ve analyzed, essentially the original method of developing antivirus, is, practically speaking—dead.

What's in a name

Windows Defender Antivirus is more than what the name might traditionally imply. When you protect over a billion customers and provide a verdict for around 90 billion potentially malicious encounters each day, traditional antivirus simply doesn’t scale. Today, we have published a new white paper that describes many of those capabilities, and I’ll go through them briefly in this blog post.

Microsoft is in a unique position to deliver protection to customers. The Windows Defender team, has many industry veterans who have deep knowledge of malware, infection vectors, and even the actors and their motivations – the whole kill chain. Aside from that, Microsoft also has a foundational core of data scientists and experts in machine learning. These individuals span across the company. You can find them in Microsoft Research, of course, but check any team like Office or Bing or Family Safety and you will find at least a few, to an army of data scientists, nearby. Data science is a core part of Microsoft’s DNA which, of course, extends to the Windows Defender team where we have been evolving machine learning to protect our customers.

Machine learning, behavioral analysis, and other evolutions

Windows Defender Antivirus has machine learning models on the local client and in our cloud protection system. At the client, we use high-performance, mostly linear models, to detect malware.

Although 97% of malware is detected locally by the client, we send additional data on suspicious signals and files to the cloud protection system. Heuristic detections, behavioral analysis, and client-based machine learning models work together to identify these potential threats and send them to the cloud protection system for its high-power computational capability. Our most intensive machine learning models live in our cloud protection system. These models can apply enormous computing power to machine learning models that could never run efficiently on the client. We have quick, linear models, of course, in addition to more intensive models like Deep Neural Networks. However, to run hundreds of these models simultaneously to report a verdict in milliseconds, you need serious power that you would not want to impose upon a single computer.

Machine learning as a buzzword has become a hot button topic in the antivirus community, so I want to clarify my position here. Machine learning is but one tool of many required to protect customers. The best artisans utilize a collection of tools and know when to choose one over the other to master their craft. In this case, the craft is customer protection.

At Microsoft, we have the luxury of having the efficiency and precision of traditional antivirus and automated, intelligence-based capabilities that use behavioral analysis, heuristics, and machine learning to scale out our human experts.

On any given day, 30% to 40% of customer malware encounters are related to malware seen more than one time in the ecosystem. These types of threats are great candidates for efficient client-based signatures. The rest of encounters, and in fact 96% of the distinct attacks and signals we see, are first seen threats. These are prime candidates for evolved, intelligent features that use behavioral analysis, machine learning models, or other methodologies.

As mentioned above, most of the threats our customers encounter are detected at the client. However, some of our most powerful, most intensive rules, run in our cloud protection system. So, that additional 3% of threats are detected through intensive processing power in a way that doesn’t impact client performance. We let our cloud protection system do the heavy lifting. Our cloud protection system is also connected to the Microsoft Intelligent Security Graph (ISG), which is informed by trillions of signals from billions of sources consisting of inputs we receive across our endpoints, consumer services, commercial services and on-premises technologies. All that uniquely positions us to personalize our protection and identify anomalies which often represent new threats.

This vast framework of protection tools allows us to efficiently scale out our human expertise. For every malicious signal we manually investigate, we provide protection for an additional 4,500 threats and 12,000 customers (on average). That works out to 99.98% of threats detected for the .02% we manually investigate—a pretty decent ratio.

In the protection stack

Of course, Windows Defender Antivirus is just one key component in the fight against malware and other types of threats. Windows 10 includes a stack of security features that complement Windows Defender Antivirus. We’ve recently introduced Windows Defender Advanced Threat Protection (Windows Defender ATP) to the Windows Defender brand family, which can help customers to detect and respond to advanced attacks that might get past your primary defenses. These features combined provide a secure and full-featured suite of solutions to help customers achieve the security profile that today’s modern threat landscape and customer demand.

Figure 2: The Windows Security Protection stack utilizes a mix of traditional and modern technologies to block cybersecurity threats