From Big Data to Smart Data

Digging for Data in a Particle Mine

An intentional collision: Elementary particles collide at almost the speed of light in a particle accelerator at the nuclear research center CERN. The collision produces innumerable particle fragments that scientists hope will provide them with new insights.

Siemens software is helping the Large Hadron Collider near Geneva to reveal nature’s ultimate secrets.

Collisions take place constantly near the French town of Cessy, within sight of Mont Blanc. But the reason is not icy roads or careless drivers. In fact, residents sleep very peacefully at night. Nevertheless, up to 800 million times per second, elementary particles smash into one another at inconceivable speeds some 50 to 100 meters underground. All of this takes place in a 27-kilometer ring tunnel called the Large Hadron Collider (LHC), which is operated by the European nuclear research center CERN and extends below portions of both Switzerland and France.

The collisions are produced when two proton beams crash into one another at four spots in the tunnel that are equipped with detectors. From the hail of fragments that follows such controlled impacts, physicists hope to gain deeper insights into the structure of the universe at the smallest levels. And they have been successful. In 2012, for example, they detected the Higgs boson, the particle associated with the Higgs field, which gives elementary particles their mass.

The process of modernizing this colossal machine began at the start of 2013. In March 2015, it will start up again, and the particle beams will then collide with twice as much energy. Following the discovery of the Higgs boson, researchers now want to delve even more deeply into the unanswered questions of the universe.

The newly developed analysis software from Siemens can often detect faults within just half an hour.

The Accelerator Generates More Than 300 Terabytes per Year

The LHC helps physicists extract knowledge from massive quantities of data measured in petabytes (one petabyte is equal to a thousand terabytes). With approximately 30 million sensors taking its mechanical pulse, the accelerator itself, one of the most complicated automated systems in the world, also generates large streams of its own data: more than 100 terabytes per year, a figure that will climb to over 300 terabytes when it starts up again in 2015. This trove of data is being mined by software engineer Manuel Gonzalez Berges, who oversees the extensive control systems at the LHC, together with colleagues from CERN and Siemens. They are using new, adaptive diagnostic software developed by Siemens for this purpose.

The software searches for the root causes of malfunctions of all kinds, which can sometimes disable the system for weeks at a time, and it does so much faster than human experts could before. “In the past, when a system in the LHC reported a warning, it sometimes took two weeks before an expert found the actual cause of the error,” Gonzalez Berges tells visitors in the LHC control center, amidst dozens of computer screens. “With the newly developed analysis software from Siemens, which we’ve been testing since 2013 on archived operating states, we can now often pinpoint the source of these errors in just half an hour. When the accelerator starts up again in the coming days, we want to predict problems in real time during test phases, and then from 2017 onward, we’ll deploy the software on a broad scale during regular operations.”

Interpreting the flood of data streaming out of the LHC is by no means trivial. The LHC is kept up and running by innumerable automated systems that control such things as the vacuum, cooling, and energy supply. Sensors check whether the LHC is maintaining an ultra-high vacuum in the pipe that carries the particle beam, and whether its superconducting magnets are being kept sufficiently chilled. With the help of liquid helium, these magnets, which weigh many tons and keep the beam on course, are kept at the constant temperature of -271.25 degrees Celsius — colder than the temperature of outer space. Computer programs also monitor whether the control systems for the detectors are working properly.

Elementary particles speed toward each other in a 27-kilometer-long ring-shaped tunnel underground. The facility is equipped with innumerable automatic systems such as vacuum and refrigeration units and an energy supply system.

A Subterranean Giant

Gonzalez Berges leads his group of visitors to one of these measuring instruments, the CMS (Compact Muon Solenoid) detector. Together, they take an elevator down to a depth of 80 meters below the surface. Once there, he guides them through safety locks, and past computer servers and equipment cabinets covered with cables, until they eventually reach the front of the 21-meter-long, barrel-shaped detector, which is three stories high.

The detector is structured like an onion. In the middle is a pipe about the thickness of an arm, in which protons collide with one another. Around this, there is an arrangement of shells several meters thick in all. The innermost shell holds silicon detectors that record the tracks of the particle shower. Another layer measures the energy of the particles. This section is made partly of brass recycled from World War II-era Russian navy artillery shells. Finally, at the outer edge, there are special chambers with millions of wires to detect high-energy muons. The measured data is then evaluated several times over by seven computers before any unusual observations are reported.

As part of its efforts to keep the LHC running smoothly, CERN has installed over 600 Simatic control systems from Siemens over the last ten years. These systems are typically used only in complex industrial facilities. The LHC, however, is in a class by itself. An automotive plant uses only 50 to 100 of these control systems, and an oil platform will use a mere 5 to 20 of them. At the LHC, for example, one of these control units relies on 12,000 sensors to monitor the gas cycle that cools the magnets to just a few degrees above absolute zero. Each component of this process generates status messages and sometimes warnings, most of which are unimportant.

In the future, intelligent diagnostic software from Siemens will help to identify relevant events and their causes in the resulting flood of data. One example might be locating a leak when the gas pressure in a pipe falls. When viewed from the outside, however, the relationship between a warning and its cause is not always apparent, because, as Siemens software expert Mikhail Roshchin points out, “a single malfunction can lead to an avalanche of warnings.”
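How an avalanche of warnings might be traced back to a single fault can be illustrated with a small sketch. It is not the Siemens software, whose internals the article does not describe. It assumes a hand-written map of which component depends on which (the component names and the DEPENDS_ON table are hypothetical) and treats a warning as a root-cause candidate only if nothing upstream of it is also reporting a warning.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: each component lists the components it relies on.
# These names are illustrative and do not come from CERN or Siemens.
DEPENDS_ON: Dict[str, List[str]] = {
    "helium_compressor": [],
    "gas_line_pressure": ["helium_compressor"],
    "magnet_temperature": ["gas_line_pressure"],
    "vacuum_gauge": [],
    "beam_permit": ["magnet_temperature", "vacuum_gauge"],
}


def upstream(component: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that `component` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(depends_on.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, []))
    return seen


def root_cause_candidates(alarmed: Set[str],
                          depends_on: Dict[str, List[str]]) -> List[str]:
    """Keep only the alarmed components that have no alarmed component upstream."""
    return sorted(c for c in alarmed if not (upstream(c, depends_on) & alarmed))


# One falling pressure reading triggers warnings further down the chain.
warnings = {"gas_line_pressure", "magnet_temperature", "beam_permit"}
candidates = root_cause_candidates(warnings, DEPENDS_ON)
print(candidates)
```

In this toy case the three warnings collapse to one candidate, the gas line pressure, because the other two sit downstream of it. A real system would have to cope with a dependency map that is incomplete or wrong, which is part of why the problem is hard.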

Analyses of Concentrated Machine Intelligence

The Siemens software is trained by teaching it to identify patterns in problem situations from the past, and then having it look for these same patterns in new sets of data. This is called root cause analysis. In addition, computer scientists from CERN and Siemens are continuously adding new algorithms to the program manually in order to improve its capability. In this way, scientists are gradually building up a unique body of practical knowledge tailored to the LHC.
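The idea of learning patterns from past problems and looking for them in new data can be sketched in a few lines. This is only an illustration under simple assumptions, not the algorithm used at CERN: each past incident is reduced to the set of warnings it produced, and a new set of warnings is compared against those sets using Jaccard similarity, a technique chosen here for brevity. The incident data and the function names are invented for the example.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of warnings the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def learn_signatures(
    incidents: Iterable[Tuple[Iterable[str], str]]
) -> Dict[str, List[FrozenSet[str]]]:
    """Group the warning sets of past incidents by their known root cause."""
    signatures: Dict[str, List[FrozenSet[str]]] = {}
    for warnings, cause in incidents:
        signatures.setdefault(cause, []).append(frozenset(warnings))
    return signatures


def rank_causes(
    new_warnings: Iterable[str],
    signatures: Dict[str, List[FrozenSet[str]]],
) -> List[Tuple[str, float]]:
    """Score each known cause by its best-matching past incident, best first."""
    observed = frozenset(new_warnings)
    scores = [
        (cause, max(jaccard(observed, sig) for sig in sigs))
        for cause, sigs in signatures.items()
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical archive of past incidents: (warnings seen, cause found by an expert).
PAST_INCIDENTS = [
    (["pressure_low", "temp_rising", "flow_drop"], "gas leak"),
    (["pressure_low", "flow_drop"], "gas leak"),
    (["voltage_sag", "temp_rising"], "power supply fault"),
]

signatures = learn_signatures(PAST_INCIDENTS)
ranking = rank_causes(["pressure_low", "flow_drop", "valve_timeout"], signatures)
print(ranking[0][0])
```

Here the new warnings most closely resemble an earlier gas leak, so that cause is ranked first. Adding expert-written rules on top of such learned patterns, as the CERN and Siemens scientists do, would correspond to extending the scoring step by hand.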

In the future, this should enable the program to interpret data quickly in critical situations, even though the data is “usually unreliable, incomplete, and out of sequence,” says CERN computer scientist Filippo Tilaro.
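What dealing with incomplete, out-of-sequence data can involve is shown in the following minimal sketch. It assumes that every reading carries its own timestamp and that a missing value arrives as None. The Reading type and the clean function are hypothetical and stand in for whatever preprocessing the real software performs.

```python
from typing import Iterable, List, NamedTuple, Optional


class Reading(NamedTuple):
    timestamp: float          # seconds, as stamped at the sensor (assumed)
    sensor: str
    value: Optional[float]    # None marks a reading that arrived incomplete


def clean(readings: Iterable[Reading]) -> List[Reading]:
    """Restore time order, drop incomplete readings, and drop repeated ones."""
    seen = set()
    cleaned: List[Reading] = []
    for reading in sorted(readings, key=lambda r: r.timestamp):
        key = (reading.timestamp, reading.sensor)
        if reading.value is None or key in seen:
            continue
        seen.add(key)
        cleaned.append(reading)
    return cleaned


# Readings as they might arrive: late, incomplete, and partly repeated.
raw = [
    Reading(3.0, "pressure", 0.91),
    Reading(1.0, "pressure", 1.02),
    Reading(2.0, "pressure", None),
    Reading(1.0, "pressure", 1.02),
]
ordered = clean(raw)
print([r.timestamp for r in ordered])
```

Sorting by the sensor's own timestamp rather than by arrival time is the key design choice here. It only works if the sensor clocks can be trusted, which the quote suggests cannot always be assumed.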

Despite the complexity of the analyses performed by the concentrated machine intelligence involved here, the results must be easy for users to understand. The people who use CERN’s instruments are often scientists who are at the site only briefly and don’t have time to familiarize themselves with complicated software. The diagnostic program therefore has an intuitive user interface and offers specific suggestions for steps to take in the event of a false alarm.

The World Wide Web Was Born at CERN

The use of this program benefits not only CERN but Siemens as well. “The things we learn here will definitely be applied to management software for other industrial facilities too,” says Thomas Hahn, head of software development at Siemens Corporate Technology. And this carries on an old tradition. Research at CERN has often led to technologies that shape our everyday lives. For example, the World Wide Web was born here, and the institution has made decisive contributions to the development of computed tomography.

After a little more than 20 minutes, Gonzalez Berges heads back to the elevator. When the accelerator starts running again in March 2015, the stream of visitors in these hallowed halls of physics will practically disappear. At that point, however, the LHC might open new vistas when it operates with twice as much energy, or 13 teraelectronvolts – perhaps uncovering new evidence of dark matter or supersymmetry, which assigns to each of the elementary particles an associated, as-yet undiscovered partner. If any important error messages arise during operation, Siemens’ new diagnostic software will help locate the cause. Better yet, as it comes to understand the LHC, the software is eventually expected to suggest ways in which the machine itself can be better organized — the gas cycle or the power supply, for example. “We’re only starting to make use of the potential this diagnostic software has for us,” says Gonzalez Berges.