Mira – HPCwire
https://www.hpcwire.com
Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Mira is First Supercomputer to Simulate Large Hadron Collider Experiments
https://www.hpcwire.com/2015/11/04/mira-is-first-supercomputer-to-simulate-large-hadron-collider-experiments/
Wed, 04 Nov 2015

Argonne physicists are using Mira to perform simulations of Large Hadron Collider (LHC) experiments with a leadership-class supercomputer for the first time, shedding light on a path forward for interpreting future LHC data. Researchers at the Argonne Leadership Computing Facility (ALCF) helped the team optimize their code for the supercomputer, which has enabled them to simulate billions of particle collisions faster than ever before.

With each collision producing about a megabyte of data, LHC, located on the border of France and Switzerland, generates a colossal amount of data. Even after filtering out about 99 percent of it, scientists are left with around 30 petabytes (or 30 million gigabytes) each year to analyze for a wide range of physics experiments, including studies on the Higgs boson and dark matter.

To help tackle the considerable challenge of interpreting all this data, researchers from the U.S. Department of Energy’s (DOE’s) Argonne National Laboratory are demonstrating the potential of simulating collision events with Mira, a 10-petaflops IBM Blue Gene/Q supercomputer at the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility.

“Simulating the collisions is critical to helping us understand the response of the particle detectors,” said principal investigator Tom LeCompte, an Argonne physicist and the former physics coordinator for the LHC’s ATLAS experiment, one of four particle detectors at the facility. “Differences between the simulated data and the experimental data can lead us to discover signs of new physics.”

A visualization of a simulated collision event in the ATLAS detector. This simulation, containing a Z boson and five hadronic jets, is an example of an event that is too complex to be simulated in bulk using ordinary PC-based computing grids.

This marks the first time a leadership-class supercomputer has been used to perform massively parallel simulations of LHC collision events. Since 2002, LHC scientists have relied on the Worldwide LHC Computing Grid for all their data processing and simulation needs. Linking thousands of computers and storage systems across 41 countries, this international distributed computing infrastructure allows data to be accessed and analyzed in near real-time by an international community of more than 8,000 physicists collaborating among the four major LHC experiments.

“Grid computing has been very successful for LHC, but there are some limitations on the horizon,” LeCompte said. “One is that some LHC event simulations are so complex that it would take weeks to complete them. Another is that the LHC’s computing needs are set to grow by at least a factor of 10 in the next several years.”

Here is a link to the full unedited article: https://www.alcf.anl.gov/articles/alcf-helps-tackle-large-hadron-collider-s-big-data-challenge

Utah University Turns to HPC for Safer Explosives Transport
https://www.hpcwire.com/2015/01/08/utah-university-turns-hpc-safer-explosives-transport/
Thu, 08 Jan 2015

In 2005, a semi-truck caught the nation’s attention when it crashed and caught fire, igniting 35,000 pounds of explosives it was carrying through Utah’s Spanish Fork Canyon.
Photo: Utah Department of Transportation

Thanks to a brief delay between the truck’s crash and the subsequent explosion, there were no fatalities. But, as evidenced by a number of injuries and a 30-by-70-foot crater blown out of the highway, the consequences of such accidents can be severe.

To shed light on the mechanism that caused the chain reaction and help prevent future occurrences, Professor Martin Berzins and his research team from the University of Utah turned to the Argonne Leadership Computing Facility’s 10-petaflop IBM Blue Gene/Q system, Mira. Their research was the subject of a feature article by Jim Collins on the ALCF website.

In the case of the Utah highway incident, the 8,400 cylinders of explosives in transport should have burned away more slowly in an accidental fire, through a process called deflagration. Instead the cylinders detonated, combusting at supersonic speeds and generating a shockwave that blew out the windows of nearby cars.

The research project is using INCITE funding to recreate the detonation virtually. Getting the simulation to reach the desired state has proved particularly challenging due to the incorporation of multiple spatial and temporal scales, but the team’s perseverance has paid off.

“We set out to simulate one-eighth of the actual semi-truck with the explosives in their original packing configuration, but it was not an easy feat,” says Jacqueline Beckvermit, a PhD student at the University of Utah. “After two years of work and more than 100 million computing hours, we finally reached detonation this fall.”

Based on current simulations, the team has identified two possible scenarios that could have led to the explosion: one involving a high-pressure environment caused by trapped gases from the cylinders, and a similar high-pressure scenario caused by the impact of exploding cylinders.

Optimizing and scaling their Uintah Computational Framework to harness a large number of Mira’s cores was key to the group’s success, and plans are in place to scale even higher in the future.

Berzins says their ultimate goal will be to enable strategies to prevent similar accidents from occurring in the future.

SC13 Research Highlight: Extreme Scale Plasma Turbulence Simulation
https://www.hpcwire.com/2013/11/16/sc13-research-highlight-extreme-scale-plasma-turbulence-simulation/

As the global energy economy makes the transition from fossil fuels toward cleaner alternatives, fusion becomes an attractive potential solution for satisfying growing energy needs. Fusion energy, which is the power source of the sun, can be generated on Earth, for example, in magnetically confined laboratory plasma experiments (called “tokamaks”) when the isotopes of hydrogen (e.g., deuterium and tritium) combine to produce an energetic helium “alpha” particle and a fast neutron, with an overall energy multiplication factor of 450:1.
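
As a rough back-of-the-envelope check on that figure (this illustration is not from the article), the deuterium-tritium reaction releases about 17.6 MeV per event, while the reacting ion pair in a burning plasma carries thermal energy of only a few tens of keV:

\mathrm{D} + \mathrm{T} \;\rightarrow\; {}^{4}\mathrm{He}\,(3.5~\mathrm{MeV}) + n\,(14.1~\mathrm{MeV}),
\qquad
\frac{E_{\mathrm{released}}}{E_{\mathrm{thermal}}} \approx \frac{17.6~\mathrm{MeV}}{\sim 40~\mathrm{keV}} \approx 450,

which is consistent with the 450:1 energy multiplication factor quoted above.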

Building the scientific foundations needed to develop fusion power demands high-physics-fidelity predictive simulation capability for magnetically-confined fusion energy (MFE) plasmas. To do so in a timely way requires utilizing the power of modern supercomputers to simulate the complex dynamics governing MFE systems — including ITER, a multi-billion dollar international burning plasma experiment supported by 7 governments representing over half of the world’s population.

Unavoidable spatial variations in such systems produce microturbulence, which can significantly increase the transport rate of heat, particles, and momentum across the confining magnetic field in tokamak devices. Since the balance between these energy losses and the self-heating rates of the fusion reactions will ultimately determine the size and cost of an actual fusion reactor, understanding and possibly controlling the underlying physical processes is key to achieving the efficiency needed to help ensure the practicality of future fusion reactors.

The goal here is to gain new physics insights on MFE confinement scaling by making effective use of powerful world-class supercomputing systems such as the IBM Blue Gene/Q “Mira” at the Argonne Leadership Computing Facility (ALCF). The associated knowledge gained addresses the key question of how turbulent transport and the associated confinement characteristics scale from present-generation devices to the much larger ITER-scale plasmas. This involves the development of modern software capable of using leadership-class supercomputers to carry out reliable first-principles-based simulations of multi-scale tokamak plasmas. The fusion physics challenge here is that the key decade-long MFE estimates of confinement scaling with device size (the so-called “Bohm to gyro-Bohm” rollover trend caused by the ion temperature gradient instability) demand much higher resolution to be realistic and reliable. Our important new fusion physics finding is that this rollover is much more gradual than was established in earlier, far lower-resolution, shorter-duration studies, with the magnitude of the transport now reduced by a factor of two.
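
For context, a brief reminder of the two scalings at issue (my notation, not taken from the article): the Bohm and gyro-Bohm estimates of the ion thermal diffusivity differ by a factor of the normalized gyroradius, rho* = rho_i / a,

\chi_{\mathrm{B}} \sim \frac{cT}{eB},
\qquad
\chi_{\mathrm{gB}} \sim \rho_{*}\,\chi_{\mathrm{B}} = \frac{\rho_i}{a}\,\frac{cT}{eB},

so the “rollover” describes the transition from Bohm-like transport in present-day devices toward the more favorable gyro-Bohm scaling expected at ITER scale, where rho* is much smaller.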

The basic particle method has long been a well-established approach that simulates the behavior of charged particles interacting with each other through pair-wise electromagnetic forces. At each time step, the particle properties are updated according to these calculated forces. For applications on powerful modern supercomputers with deep cache hierarchies, a pure particle method is very efficient with respect to locality and arithmetic intensity (compute bound). Unfortunately, the O(N^2) complexity makes a particle method impractical for plasma simulations using millions of particles per process. Rather than calculating O(N^2) forces, the particle-in-cell (PIC) method, which was introduced by J. Dawson and N. Birdsall in 1968, employs a grid as the medium for calculating the long-range electromagnetic forces. This reduces the complexity from O(N^2) to O(N + M log M), where M is the number of grid points and is usually much smaller than N. Specifically, the PIC simulations are carried out using “macro” particles (~10^3 times the radius of a real charged ion particle) with characteristic properties, including position, velocity and weight. However, achieving high parallel and architectural efficiency is very challenging for a PIC method due to potential fine-grained data hazards, irregular data access, and low arithmetic intensity. These issues become more severe as the HPC community addresses even more radical changes in computer architectures with the continuing multicore and manycore revolution.
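
The cycle is easiest to see in a stripped-down form. Below is a minimal one-dimensional electrostatic PIC sketch in C; it is illustrative only, since GTC-P itself is a five-dimensional gyrokinetic toroidal code, and none of the names or parameters below come from it.

/* Minimal 1D electrostatic particle-in-cell (PIC) sketch.
 * Illustrative only: it shows the scatter -> field-solve -> gather -> push
 * cycle discussed in the text, not the gyrokinetic physics of GTC-P. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define NG     64        /* number of grid cells               */
#define NP     10000     /* number of macro-particles          */
#define L      1.0       /* periodic domain length             */
#define DT     0.05      /* time step (normalized units)       */
#define NSTEPS 100

int main(void) {
    const double PI = 3.14159265358979323846;
    static double x[NP], v[NP];   /* particle positions and velocities   */
    double rho[NG], E[NG];        /* grid charge density and field       */
    double dx = L / NG;
    double qm = -1.0;             /* charge-to-mass ratio (electrons)    */
    double q  = -1.0 / NP;        /* macro-particle charge (normalized)  */

    /* initialize: uniform positions, small sinusoidal velocity perturbation */
    for (int p = 0; p < NP; p++) {
        x[p] = (p + 0.5) * L / NP;
        v[p] = 0.01 * sin(2.0 * PI * x[p] / L);
    }

    for (int step = 0; step < NSTEPS; step++) {
        /* 1. scatter: deposit charge to the grid with linear (CIC) weighting */
        for (int j = 0; j < NG; j++) rho[j] = 0.0;
        for (int p = 0; p < NP; p++) {
            double s = x[p] / dx;
            int    j = (int)s;
            double w = s - j;
            rho[j % NG]       += q * (1.0 - w) / dx;
            rho[(j + 1) % NG] += q * w / dx;
        }

        /* neutralizing ion background: subtract the mean charge density */
        double rho_mean = 0.0;
        for (int j = 0; j < NG; j++) rho_mean += rho[j] / NG;
        for (int j = 0; j < NG; j++) rho[j] -= rho_mean;

        /* 2. field solve: integrate dE/dx = rho (eps0 = 1), remove mean E */
        E[0] = 0.0;
        for (int j = 1; j < NG; j++) E[j] = E[j - 1] + rho[j] * dx;
        double E_mean = 0.0;
        for (int j = 0; j < NG; j++) E_mean += E[j] / NG;
        for (int j = 0; j < NG; j++) E[j] -= E_mean;

        /* 3. + 4. gather the field at each particle and push it */
        for (int p = 0; p < NP; p++) {
            double s = x[p] / dx;
            int    j = (int)s;
            double w = s - j;
            double Ep = (1.0 - w) * E[j % NG] + w * E[(j + 1) % NG];
            v[p] += qm * Ep * DT;
            x[p] += v[p] * DT;
            x[p] -= L * floor(x[p] / L);   /* periodic wrap */
        }
    }
    printf("done: %d particles, %d cells, %d steps\n", NP, NG, NSTEPS);
    return 0;
}

Each time step performs the same four phases described above: scatter (charge deposition), field solve, gather (field interpolation), and particle push.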

Machines such as the IBM BG/Q Mira demand at least 49,152-way MPI parallelism and up to 3 million-way thread-level parallelism in order to fully utilize the system. While distributing particles to at least 49,152 processes is straightforward, the distribution of a 3D torus-shaped grid among those processes is non-trivial. For example, first consider decomposing the torus radially into sub-domains of equal radial extent. Because the number of poloidal grid points in this circular geometry grows with radius, the sub-domains close to the edge of the system will contain far more grid points than those near the core. This leads to potential load-imbalance issues for the associated grid-based work.
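
The imbalance is easy to see with a toy model (the numbers below are made up for illustration and are not GTC-P’s actual mesh): if the poloidal grid-point count grows with radius, an equal-width radial decomposition hands the edge processes far more grid work than the core processes.

/* Illustration of radial load imbalance (toy numbers, not the GTC-P mesh).
 * Assume an annular cross-section with radial index i = 0..NR-1 and a
 * number of poloidal grid points proportional to radius. */
#include <stdio.h>

#define NR    1024       /* radial grid surfaces                 */
#define NPROC 8          /* processes along the radial direction */

int main(void) {
    long total[NPROC] = {0};
    for (int i = 0; i < NR; i++) {
        long mpol = 32 + 4L * i;          /* poloidal points grow with radius */
        total[i / (NR / NPROC)] += mpol;  /* equal-width radial sub-domains   */
    }
    /* grid points owned by each process: edge processes get far more work */
    for (int p = 0; p < NPROC; p++)
        printf("process %d owns %ld grid points\n", p, total[p]);
    return 0;
}

One common remedy, and presumably part of the “special attention” to load balance described below, is to choose non-uniform radial partition boundaries so that each process owns roughly the same amount of grid work.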

Through a close collaboration with the Future Technologies Group at Lawrence Berkeley National Laboratory, we have developed and optimized a new version of the Gyrokinetic Toroidal Code (“GTC-Princeton” or “GTC-P”) to address the challenges of the PIC method on leadership-class systems in the multicore/manycore regime. GTC-P includes multiple levels of parallelism: a 2D domain decomposition, a particle decomposition, and loop-level parallelism implemented with OpenMP, all of which help enable this state-of-the-art PIC code to efficiently scale to the full capability of the largest extreme-scale HPC systems currently available. Special attention has been paid to the load-imbalance issue associated with domain decomposition. To improve single-node performance, we select a “structure-of-arrays” (SOA) data layout for particle data, align memory allocation to facilitate SIMD intrinsics, bin particles to improve locality, and use loop fusion to improve arithmetic intensity. We also manually flatten irregular nested loops to expose more parallelism to OpenMP threads. GTC-P features a two-dimensional topology for point-to-point communication. On the IBM BG/Q system, with its 5D torus network, we have optimized communication with customized process mapping. Data parallelism is also being continuously exploited through SIMD intrinsics (e.g., QPX intrinsics on IBM BG/Q) and by improving data movement through software prefetching.
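
A schematic of the structure-of-arrays layout and OpenMP loop-level parallelism described above (the field names, alignment choice, and simplified push are illustrative assumptions, not GTC-P’s actual data structures):

/* Schematic of a structure-of-arrays (SOA) particle layout with OpenMP
 * loop-level parallelism. Names and the simplified push are illustrative. */
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <omp.h>

typedef struct {
    /* one contiguous, aligned array per particle property (SOA), which keeps
     * vector lanes full and memory accesses unit-stride, unlike an
     * array-of-structs layout where properties are interleaved */
    double *x, *y, *z;        /* positions       */
    double *vx, *vy, *vz;     /* velocities      */
    double *w;                /* particle weight */
    long    n;
} particles_t;

static double *alloc_aligned(long n) {
    void *p = NULL;
    /* 64-byte alignment helps the compiler emit aligned SIMD loads/stores */
    if (posix_memalign(&p, 64, (size_t)n * sizeof(double)) != 0) return NULL;
    return (double *)p;
}

void push(particles_t *pt, const double *Ex, const double *Ey,
          const double *Ez, double qm, double dt) {
    /* loop-level parallelism over particles; each iteration is independent,
     * so the loop also vectorizes cleanly with the SOA layout */
    #pragma omp parallel for simd
    for (long i = 0; i < pt->n; i++) {
        pt->vx[i] += qm * Ex[i] * dt;
        pt->vy[i] += qm * Ey[i] * dt;
        pt->vz[i] += qm * Ez[i] * dt;
        pt->x[i]  += pt->vx[i] * dt;
        pt->y[i]  += pt->vy[i] * dt;
        pt->z[i]  += pt->vz[i] * dt;
    }
}

Binning the particles by grid cell, as mentioned above, then keeps the gather and scatter phases operating on nearby grid data, improving cache locality.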

Simulations of confinement physics for large-scale MFE plasmas have been carried out for the first time with very high phase-space resolution and long temporal duration, delivering important new scientific insights. This was enabled by the new GTC-P code, which was developed to use multi-petascale capabilities on world-class systems such as the IBM BG/Q “Mira” at the ALCF and “Sequoia” at LLNL. (Accomplishments are summarized in the two figures accompanying the original article.)

Figure 2: Important new scientific discoveries enabled by harnessing modern supercomputing capabilities at extreme scale

The success of these projects was greatly facilitated by a true interdisciplinary collaborative effort with computer science and applied mathematics scientists, which has produced modern C and CUDA versions of the key HPC code (originally written, as is the case for the vast majority of codes in the FES application domain, in Fortran-90). The demonstrated capability to run at scale on the largest open-science IBM BG/Q system (“Mira” at the ALCF) opened the door to obtaining access to NNSA’s “Sequoia” system at LLNL, which then produced the outstanding results shown in Figure 1 of the original article. More recently, excellent performance of the GPU version of GTC-P has been demonstrated on the “Titan” system at the Oak Ridge Leadership Computing Facility (OLCF). Finally, the G8-sponsored international R&D advances have enabled this project to gain collaborative access to a number of the top international supercomputing facilities, including the Fujitsu K Computer, Japan’s #1 supercomputer. In addition, these highly visible accomplishments have very recently enabled this project to begin collaborative applications on China’s new Tianhe-2 (TH-2) Intel MIC-based system, the #1 supercomputing system worldwide.

Directing Mira
https://www.hpcwire.com/2013/10/30/directing-mira/
Wed, 30 Oct 2013

In the latest of an interesting series of Q&As on Argonne National Laboratory’s website, Susan Coghlan, Deputy Division Director at the Argonne Leadership Computing Facility, opens up about her role working with Mira, the fifth-fastest supercomputer in the world.

In addition to her position as Deputy Division Director at the Argonne Leadership Computing Facility (ALCF), Coghlan is also the project director for the large supercomputing systems that are essential to Argonne’s mission. Overseeing these multimillion-dollar system installations has in many ways become her primary role, and as such she has unique insight into what it takes to go from the planning stage to implementation for these leadership-class systems.

With Mira, which occupies the fifth position on the TOP500 list of fastest computers, now in full production mode and open for scientific research, Coghlan addresses what it took to get to this point.

“Typically, it takes about five years, from start to finish, to complete a project like this – to go from preparing a budget and figuring out what type of system is possible to the delivery and deployment of the system,” she states. “So, we actually started the planning and documentation process for Mira in 2008. In this case, we worked very closely with our vendor, IBM, and also had a close collaborative R&D relationship with Lawrence Livermore National Laboratory and IBM. I was responsible for making sure we had the right people and appropriate experts at Argonne reviewing the designs, providing the feedback to IBM, approving design choices and looking at applications and their performance on the different designs as they moved forward.”

Coghlan continues: “The delivery of the first racks for Mira started in April 2012, and that process of delivery, then build-out, went on until July 2012. After delivery, the system moved into the acceptance preparation phase, and finally through the acceptance tests. Mira was accepted in December 2012. Initially, the research being run on Mira was part of the Early Science Program (ESP), and now that Mira is fully online, it is open to all research, including work being done that is part of the Innovative & Novel Computational Impact on Theory and Experiment (INCITE) program.”

Now Mira is up and running and tackling some of the grand challenges of our time, with a focus on sustainable energy, a healthy environment and the security of the country. Current projects include developing better batteries, advancing wind turbine design, and studying water resources and earthquakes.

Senator Says U.S. Congress Doesn’t ‘Get’ Supercomputers
https://www.hpcwire.com/2013/07/18/senator_says_u-s-_congress_doesn_t_get_supercomputers/
Thu, 18 Jul 2013

Supercomputers are clearly important to the ability of the U.S. to compete on the global stage, but some members of Congress don’t understand that, Illinois Senator Dick Durbin said at the recent dedication of Mira, the IBM BlueGene/Q supercomputer installed last year at the Argonne National Laboratory in Illinois.

“They know the cost [of supercomputers] but they don’t know the value,” Sen. Durbin said during the July 1 dedication ceremony. “We really need to educate members of Congress. This supercomputing competition is really key to America’s competitiveness, and to a lot of breakthroughs that will benefit the whole world.”

The U.S. still spends more than any other nation on HPC as a whole, but it is at risk of falling behind in the race to develop the first exascale system. China in particular has been strong at the high end of the market, where the country already outspends the U.S. on the development of massive supercomputers.

China’s commitment became clear last month, when Tianhe-2, a 3.1-million-core supercomputer rated at 33.8 petaflops of sustained performance, was named the world’s fastest supercomputer by the TOP500 organization. It is the second time that a Chinese supercomputer has been named the world’s fastest.

It’s great that the Chinese are becoming more competitive, Argonne National Lab Director Eric Isaacs said at the dedication event. “But it’s also a real threat,” he said in a Voice of America (VOA) video from the event. “We’re seeing China more often taking that lead role of … having the fastest computer in the world.”

Many in the HPC community have criticized Congress and other leaders for not dedicating enough resources for the U.S. to lead the world in exascale computing. While the U.S. still develops the best technology, it lacks the financial commitment and the human leadership necessary to keep pace with the exascale efforts of other nations, the Exascale Report recently said.

The prosperity of the U.S. is tied in part to supercomputing, Sen. Durbin said. “There’s a competition in this world not just for jobs but for basic research that can be applied to the private sector and the public sector, and of course the world of supercomputing is where many of those battles are being fought,” he said in the video.

With 8.5 petaflops of sustained performance, Mira is the fifth-fastest supercomputer in the world. The 786,000-core cluster is run on behalf of the United States Department of Energy, with funding partially coming from the National Science Foundation. The system will be used for scientific research, including materials science, climatology, seismology, and computational chemistry.

While it dropped from third place to fifth place on the TOP500 in the last year, Mira still leads the world of supercomputers in one category: energy efficiency. The system consumes just 3.9 megawatts of electricity, thanks in large part to a liquid cooling system. Instead of using fans, Mira’s processor nodes dissipate heat using chilled water that flows through copper tubes.

The Weekly Top Five features the five biggest HPC stories of the week, condensed for your reading pleasure. This week, we cover Argonne’s new 10-petaflop supercomputer, big rig aerodynamics, Austria’s new 150-teraflop supercomputer, Whamcloud’s partnership with Bull, and Bright Computing’s deal with Dell.

10 Petaflop IBM Supercomputer on Order at Argonne Lab

IBM has been selected to build a 10-petaflop supercomputer for Argonne National Laboratory. The advanced computing machine, named “Mira,” will boost researchers’ ability to tackle advanced scientific challenges with real-world significance. Projects at the top of Mira’s to-do list include designing ultra-efficient batteries for the next generation of electric cars, modeling detailed climate change scenarios and simulating the beginnings of the universe.

The IBM Blue Gene/Q supercomputer sports more than 750,000 cores and uses advanced chip designs and energy-efficient water cooling. With 10 petaflops of processing power, Mira is 20 times faster than Argonne’s current supercomputer, Intrepid.

According to Rick Stevens, associate laboratory director for computing, environment and life sciences at Argonne National Laboratory, the new supercomputer “will help address the critical demand for complex modeling and simulation capabilities, which are essential to improving our economic prosperity and global competitiveness.”

Argonne sees Mira as a stepping stone to exascale computing: machines that will be 100 times more powerful than Mira. According to the release, “Mira will offer an opportunity for scientists to become more familiar with the capabilities an exascale machine will offer and the programming changes it will require. For example, scientists will have to scale their current computer codes to more than 750,000 individual computing cores, providing them preliminary experience on how scalability might be achieved on an exascale-class system with 100s of millions of cores.”

When Mira comes online in 2012, scientists around the world, from industry, academia and government research labs, will be given the opportunity to apply for time through the DOE’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) and the ASCR Leadership Computing Challenge (ALCC) programs.

Oak Ridge National Laboratory’s Jaguar supercomputer is being used to improve the aerodynamics of long haul tractor trailers, and in the process save billions of gallons of fuel each year. South Carolina-based BMI Corp. partnered with ORNL researchers to develop the SmartTruck UnderTray System, “a set of integrated aerodynamic fairings that improve the aerodynamics of 18-wheeler (Class 8) long-haul trucks.” After installation, the typical big rig can expect to achieve a fuel savings of between 7 and 12 percent.

With Jaguar, the time it took BMI to process its complex models was reduced from days to hours, and the system’s advanced simulation capabilities obviated the need for time-consuming and costly physical prototypes. The time-savings meant the company was able to finish the project two years ahead of schedule, going from concept to production in 18 months instead of the 3 1/2 years they had originally anticipated.

The technology has significant financial and environmental implications. Mike Henderson, chief executive officer and founder of BMI, noted that if all 1.3 million Class 8 trucks in the US were outfitted with the basic UnderTray package, the average fuel economy would go from 6 mpg up to 6.5 mpg or more, saving trucking companies 1.5 billion gallons of diesel fuel and $5 billion in fuel costs each year. Better fuel efficiency means fewer CO2 emissions as well: 32.7 billion pounds, or 16.4 million tons, fewer.
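
Those figures hang together under plausible assumptions. A quick back-of-the-envelope check (the annual mileage, diesel price, and CO2-per-gallon values below are my own assumptions, not numbers from the article):

/* Back-of-the-envelope check of the SmartTruck fuel and CO2 figures.
 * Annual mileage, diesel price, and CO2 per gallon are assumed values. */
#include <stdio.h>

int main(void) {
    double trucks         = 1.3e6;    /* Class 8 trucks in the US (from the article) */
    double miles_per_yr   = 90000.0;  /* assumed annual mileage per long-haul truck  */
    double mpg_before     = 6.0;      /* from the article                            */
    double mpg_after      = 6.5;      /* from the article                            */
    double usd_per_gal    = 3.3;      /* assumed 2011-era diesel price, USD          */
    double lb_co2_per_gal = 22.4;     /* approx. CO2 emitted per gallon of diesel    */

    double gal_saved = trucks * miles_per_yr * (1.0 / mpg_before - 1.0 / mpg_after);
    printf("fuel saved : %.2f billion gallons/year\n", gal_saved / 1e9);
    printf("fuel cost  : %.2f billion USD/year\n", gal_saved * usd_per_gal / 1e9);
    printf("CO2 avoided: %.1f billion pounds (%.1f million tons)/year\n",
           gal_saved * lb_co2_per_gal / 1e9,
           gal_saved * lb_co2_per_gal / 2000.0 / 1e6);
    return 0;
}

With roughly 90,000 miles per truck per year, the improvement from 6 to 6.5 mpg reproduces the quoted savings of about 1.5 billion gallons and $5 billion, and comes within a few percent of the quoted CO2 reduction.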

Energy Secretary Steven Chu commented on the success of the collaboration, which was made possible through ORNL’s Industrial High-Performance Computing Partnerships Program:

“The Department of Energy’s supercomputers provide an enormous competitive advantage for the United States. This is a great example of how investments in innovation can help lead the way to new jobs, new ways of cutting our carbon emissions and new opportunities for America to succeed in the global marketplace.”

Austrian Universities to Share New Supercomputer

MEGWARE, a leading German IT company, has been selected to build Austria’s fastest supercomputer at a cost of EUR 4.2 million. The computer will be used by researchers at the Vienna University of Technology, the University of Vienna and the University of Natural Resources and Life Sciences (BOKU). A Europe-wide tender process called for a “first-class energy efficiency of the entire system and an enormously high raw data throughput.” The universities chose MEGWARE based on the merits of the company’s design and its commitment to using energy-efficient technology.

The 150-teraflop supercomputer, which will be known as “Vienna Scientific Cluster 2” (VSC-2), will provide Vienna scientists with the power to make important scientific advances. VSC-2 includes more than 1,300 MEGWARE-designed servers, each of which is equipped with two AMD Opteron Magny-Cours 6132HE processors, providing the machine with 21,000-plus processor cores.

The new supercomputer will have a significant lead on its predecessor, VSC. At 150 teraflops, the new system will be five times more powerful. VSC-2 is also more energy-efficient, employing state-of-the-art water cooling technology to reduce power demands.

MEGWARE representative Jörg Heydemüller expressed satisfaction with the project: “The fact that this significant IT order was awarded to a company from Chemnitz is very important for the location. It shows again the historic links existing between the Free State of Saxony and the microelectronic industry, as the project is implemented with the exclusive use of processors made by AMD.”

Bull, Whamcloud Share Lustre Dreams

Whamcloud’s busy Lustre devotees have joined with European IT vendor Bull to advance the Lustre cause. The duo have formed a strategic partnership aimed at developing the open-source file system and contributing the improvements back to the community. According to Whamcloud, the companies’ shared goal is to prepare Lustre to become the file system of choice for exascale systems.

Eric Monchalin, HPC software director at Bull, comments on the company’s involvement, “Bull has been integrating Lustre in its Extreme Computing solutions for many years, and our experts are regular contributors to the evolution of the Lustre file system. Whamcloud has gathered some of the top talents of the Lustre ecosystem. This combination of Bull and Whamcloud skills will allow our customers to continue to get the best performance out of Lustre, while being assured that the file system they chose is fully maintained and supported.”

Whamcloud CEO Brent Gorda expressed excitement about the opportunity to partner with Bull, and looks forward to “driving IO and storage development worldwide in an open-source, hardware-agnostic environment.”

Bright Cluster Joins Dell’s HPC Family

Bright Computing announced this week that its cluster management software, Bright Cluster Manager, would now be an option on Dell’s High Performance Computing Clusters (HPCC). There are already more than 50 Dell installations around the world using Bright Cluster Manager, including three TOP500 sites. Other sites include a number of universities, businesses, and government research centers.

CD-adapco uses the combined solution to run its engineering applications, and IT director Philip Jones is pleased with the product’s performance: “Bright Cluster Manager is a comprehensive cluster management solution that provides all the functionality that we need here at CD-adapco. Our key applications — STAR-CCM+ and STAR-CD — were easy to install and run well on the cluster. Bright Cluster Manager has many features that make it easy for us to manage the cluster and allow us to focus on running our CFD and CAE applications. For example, the image based provisioning makes it very easy to tailor software images and propagate changes to the compute nodes.”