Machine Learning Advances Can Strengthen Cyber Defense

Out-of-the-box thinking is needed when using machine learning to help secure systems, an expert says

Machine learning has advanced to the point where more sophisticated methods can be more effective at cyber event detection than traditional methods, an expert says. Along with emerging methods, access to large amounts of “fresh” data is important for processing, determining trends and identifying malicious activity.

Teams looking at how to use machine learning need to consider different methods, suggested Mark Russinovich, chief technology officer, Microsoft Azure, at the AFCEA Defensive Cyber Operations Symposium (DCOS) in Baltimore on May 17.

Microsoft Azure is also leveraging intelligence from its cloud service products, Russinovich said. This gives the company the massive data collection necessary for applying machine learning algorithms, as well as experience building cyber defense solutions across different domains.

Russinovich noted that there can be “significant challenges” in the phases—from concept to production—of developing machine learning processes for cyber defense. “This is a very iterative process,” he said.

In some cases, traditional machine learning is not enough to detect malicious activity, so the company is using so-called transfer learning. Transfer learning is one type of machine learning, alongside supervised learning, unsupervised learning and reinforcement learning. Russinovich contended that it is currently bringing a lot of success to the industry.

The method divides tasks into source tasks, from which the model derives a certain level of knowledge during training, and target tasks, to which the trained model is directly applied. Traditional machine learning, by contrast, treats each task in isolation. “This is going to be what drives the richness of machine learning,” he said. “Transfer learning offers the opportunity to leverage these other types of learning.”
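The source-task/target-task split can be sketched in a few lines. The following is a minimal, hypothetical example of parameter transfer, not Microsoft's system: a simple perceptron is trained on a label-rich source task, and its weights seed a target task that has only a couple of labeled examples. All feature values and labels here are fabricated for illustration.

```python
# Parameter-transfer sketch: train on a source task, then reuse the learned
# weights as the starting point for a sparsely labeled target task.

def train_perceptron(data, weights=None, epochs=20, lr=0.1):
    """data: list of (feature_vector, label) pairs, label in {0, 1}."""
    n = len(data[0][0])
    w = list(weights) if weights else [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Source task: plenty of labeled examples (fabricated two-feature flows).
source = [([3.0, 0.1], 1), ([2.5, 0.3], 1), ([0.2, 2.8], 0), ([0.4, 3.1], 0)]
w_src, b_src = train_perceptron(source)

# Target task: only two labels. Start from the source weights instead of
# zeros (the "transfer" step) and fine-tune briefly.
target = [([2.8, 0.2], 1), ([0.3, 2.9], 0)]
w_tgt, b_tgt = train_perceptron(target, weights=w_src, epochs=5)

print(predict(w_tgt, b_tgt, [2.9, 0.1]))  # prints 1
```

The design point is simply that the target model does not start from scratch; knowledge from the source task carries over through the initial weights.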

In one case study, Microsoft Azure looked at how to detect compromised virtual machines by examining network traffic. “But in figuring out how to determine that something is suspicious, we found we had no one model to do this,” Russinovich shared. “We could not do a separate machine learning tool for each one. It would not scale.” So the company created a generic machine learning tool for suspicious behavior, employing a so-called “ensemble tree learning” algorithm. Using transfer learning, they trained the model so that its decision trees vote correctly on benign versus malicious activity.
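The voting idea behind ensemble tree learning can be shown with a toy example. This is an illustrative sketch only, with made-up thresholds and features, not the production algorithm: each "tree" is reduced to a one-level decision stump over a network-flow feature vector, and the ensemble's majority vote decides benign versus malicious.

```python
# Majority-vote ensemble sketch over hypothetical network-flow features:
# (connections_per_min, distinct_ports, bytes_out_mb). Thresholds are invented.

def stump_rate(flow):   return 1 if flow[0] > 100 else 0   # connection burst
def stump_ports(flow):  return 1 if flow[1] > 50 else 0    # port-scan pattern
def stump_egress(flow): return 1 if flow[2] > 500 else 0   # heavy egress

ENSEMBLE = [stump_rate, stump_ports, stump_egress]

def classify(flow):
    """Each stump casts a vote; majority of votes wins."""
    votes = sum(tree(flow) for tree in ENSEMBLE)
    return "malicious" if votes > len(ENSEMBLE) // 2 else "benign"

print(classify((250, 80, 40)))   # prints "malicious" (2 of 3 votes)
print(classify((20, 5, 10)))     # prints "benign" (0 votes)
```

A real ensemble learns its trees and thresholds from labeled data rather than hard-coding them, but the scaling advantage Russinovich describes comes from this structure: one generic voting model instead of a separate detector per behavior.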

The key here, Russinovich said, was tagging the data. “You need labels to compare,” he noted.

By labeling the data into groups—such as previous red team activities, third-party threat intelligence feeds and cyber defense operations—and then extracting the data flows associated with each label, the company could build a profile of the malicious interactions.
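In code, this label-then-profile step amounts to grouping flows by their label source and summarizing each group. The sketch below uses fabricated label names and feature values purely to illustrate the shape of the process.

```python
# Group labeled flow records by their label source, then profile each group
# by its average feature values. All records here are fabricated examples.
from collections import defaultdict

# (label_source, bytes_out, distinct_ports)
flows = [
    ("red_team",    900, 60),
    ("red_team",    750, 45),
    ("threat_feed", 820, 55),
    ("benign_ops",   40,  3),
    ("benign_ops",   55,  4),
]

by_label = defaultdict(list)
for label, bytes_out, ports in flows:
    by_label[label].append((bytes_out, ports))

profiles = {
    label: {
        "avg_bytes_out": sum(r[0] for r in rows) / len(rows),
        "avg_ports": sum(r[1] for r in rows) / len(rows),
    }
    for label, rows in by_label.items()
}

print(profiles["red_team"])  # prints {'avg_bytes_out': 825.0, 'avg_ports': 52.5}
```

The resulting per-label profiles are what make comparison possible: new traffic can be measured against the malicious groups' statistics rather than against raw, unlabeled flows.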

The approach is highly effective even with a data set of roughly 3 gigabytes per hour, and the machine learning training takes only a few seconds to complete, Russinovich said. However, it requires a “fresh” collection of data to determine when something is malicious, he noted.