Thousands of scientists worldwide tap into CERN’s computer networks each day in their quest to better understand the fundamental structure of the universe. Unfortunately, they are not the only ones who want a piece of this vast pool of computing power, which serves the world’s largest particle physics laboratory. The hundreds of thousands of computers in CERN’s grid are also a prime target for hackers who want to hijack those resources to make money or attack other computer systems. But rather than engaging in a perpetual game of hide-and-seek with these cyber intruders via conventional security systems, CERN scientists are turning to artificial intelligence to help them outsmart their online opponents.

Current detection systems typically spot attacks on networks by scanning incoming data for known viruses and other types of malicious code. But these systems are relatively useless against new and unfamiliar threats. Given how quickly malware changes these days, CERN is developing new systems that use machine learning to recognize and report abnormal network traffic to an administrator. For example, a system might learn to flag traffic that requires an uncharacteristically large amount of bandwidth, uses the incorrect procedure when it tries to enter the network (much like using the wrong secret knock on a door) or seeks network access via an unauthorized port (essentially trying to get in through a door that is off-limits).
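The three warning signs above can be sketched in a few lines of Python. This is only an illustration, not CERN's system: it swaps the machine-learning model for a simple statistical threshold learned from past traffic. The `Flow` record, the allowed ports, the expected handshake string and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev

# Hypothetical values for illustration only; not CERN's configuration.
ALLOWED_PORTS = {22, 443}
EXPECTED_HANDSHAKE = "hello-v1"


@dataclass
class Flow:
    """One observed network connection (hypothetical record layout)."""
    source: str
    port: int
    handshake: str
    megabytes: float


def learn_bandwidth_limit(history, k=3.0):
    """Learn a ceiling as mean + k sample standard deviations of past traffic."""
    sizes = [flow.megabytes for flow in history]
    return mean(sizes) + k * stdev(sizes)


def flag(flow, bandwidth_limit):
    """Return the reasons a flow looks abnormal (empty list if none)."""
    reasons = []
    if flow.megabytes > bandwidth_limit:
        reasons.append("unusual bandwidth")
    if flow.handshake != EXPECTED_HANDSHAKE:
        reasons.append("incorrect entry procedure")
    if flow.port not in ALLOWED_PORTS:
        reasons.append("unauthorized port")
    return reasons


# "Normal" traffic the detector learns from (made-up numbers).
history = [Flow("lab-a", 443, "hello-v1", mb) for mb in (10, 12, 9, 11, 10, 13, 8)]
limit = learn_bandwidth_limit(history)

normal = Flow("lab-b", 443, "hello-v1", 11.0)
suspect = Flow("unknown", 6667, "hello-v0", 500.0)

print(flag(normal, limit))
print(flag(suspect, limit))
```

A real detector would learn many more features than these three and would pass its flags to an administrator rather than print them.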

CERN’s cybersecurity department is training its AI software to learn the difference between normal and dubious behavior on the network, and to then alert staff via phone text, e-mail or computer message of any potential threat. The system could even be automated to shut down suspicious activity on its own, says Andres Gomez, lead author of a paper describing the new cybersecurity framework.

CERN’s Jewel

CERN (the French acronym for the European Organization for Nuclear Research, the laboratory that sits on the Franco-Swiss border) is opting for this new approach to protect a computer grid used by more than 8,000 physicists to quickly access and analyze large volumes of data produced by the Large Hadron Collider (LHC). The LHC’s main job is to collide subatomic particles at high speed so that scientists can study how particles interact. Particle detectors and other scientific instruments within the LHC gather information about these collisions, and CERN makes it available to laboratories and universities worldwide for use in their own research projects.

The LHC is expected to generate a total of about 50 petabytes of data (equivalent to 15 million high-definition movies) in 2017 alone, and it demands more computing power and data storage than CERN itself can provide. In anticipation of that type of growth, the laboratory created its Worldwide LHC Computing Grid in 2002. The grid connects computers from more than 170 research facilities across more than 40 countries. CERN’s computer network functions somewhat like an electrical grid, which relies on a network of generating stations that create and deliver electricity as needed to a particular community of homes and businesses. In CERN’s case the community consists of research labs that require varying amounts of computing resources, based on the type of work they are doing at any given time.

Grid Guardians

One of the biggest challenges in defending a computer grid is that security cannot interfere with the sharing of processing power and data storage. Scientists from labs in different parts of the world might end up accessing the same computers to do their research if demand on the grid is high or if their projects are similar. CERN also has to worry about whether the computers of the scientists connecting to the grid are free of viruses and other malicious software that could enter and spread quickly because of all the sharing. A virus might, for example, allow hackers to take over parts of the grid and use those computers either to generate digital currency known as bitcoins or to launch cyber attacks against other computers. “In normal situations, antivirus programs try to keep intrusions out of a single machine,” Gomez says. “In the grid we have to protect hundreds of thousands of machines that already allow” researchers outside CERN to use a variety of software programs they need for their different experiments. “The magnitude of the data you can collect and the very distributed environment make intrusion detection on [a] grid far more complex,” he says.

Jarno Niemelä, a senior security researcher at F-Secure, a company that designs antivirus and computer security systems, says CERN’s use of machine learning to train its network defenses will give the lab much-needed flexibility in protecting its grid, especially when searching for new threats. Still, artificially intelligent intrusion detection is not without risks. One of the biggest is whether Gomez and his team can develop machine-learning algorithms that can tell the difference between normal and harmful activity on the network without raising a lot of false alarms, Niemelä says.
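The false-alarm concern can be put in numbers. The sketch below is a hypothetical illustration, not anything taken from CERN's framework: it assumes a small set of network events that have each been labeled harmful or harmless after the fact, and it counts how often a detector's alerts were right. The function name and the sample data are invented for the example.

```python
def alarm_quality(labels, alerts):
    """Compare ground truth with a detector's alerts.

    labels: True where the event really was harmful.
    alerts: True where the detector raised an alarm.
    """
    pairs = list(zip(labels, alerts))
    tp = sum(1 for harmful, alerted in pairs if harmful and alerted)
    fp = sum(1 for harmful, alerted in pairs if not harmful and alerted)
    fn = sum(1 for harmful, alerted in pairs if harmful and not alerted)
    tn = sum(1 for harmful, alerted in pairs if not harmful and not alerted)
    return {
        # Share of harmless events that still triggered an alarm.
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of harmful events that were caught.
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of alarms that pointed at something truly harmful.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Ten made-up events: eight harmless, two harmful.
labels = [False] * 8 + [True] * 2
alerts = [False, False, False, False, False, False, True, True, True, False]

result = alarm_quality(labels, alerts)
print(result)
```

In this invented sample the detector catches one of the two harmful events but also raises two alarms on harmless traffic, the kind of trade-off Niemelä describes.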

CERN’s AI cybersecurity upgrades are still in the early stages and will be rolled out over time. The first test will be protecting the portion of the grid used by ALICE (A Large Ion Collider Experiment)—a key LHC project to study the collisions of lead nuclei. If tests on ALICE are successful, CERN’s machine learning–based security could then be used to defend parts of the grid used by the institution’s six other detector experiments.

Scientific American is part of Springer Nature, which owns or has commercial relations with thousands of scientific publications (many of them can be found at www.springernature.com/us). Scientific American maintains a strict policy of editorial independence in reporting developments in science to our readers.