No "constraints from an outside funding agency" plus speeds in the world's Top 25.

On Friday, Indiana University debuted what school officials called the first petaflop supercomputer to be a dedicated university resource. In keeping with Hoosier pride (and their previous supercomputer), the new academic resource is called Big Red II.

"It's important that this is a university-owned resource," said the director of science for Argonne Leadership Computing Facility, Paul Messina, according to a Network World piece on the dedication. "Here you have the opportunity to have your own faculty, staff, and students get access with very little difficulty to this wonderful resource."

Big Red II is a Cray-built machine with a peak performance of one petaflop. For a frame of reference, IBM's Roadrunner supercomputer became the first to reach petaflop performance in 2008 and remained the fastest in the world until the end of 2009. It was recently taken offline and is set to be dismantled despite still being among the Top 25 fastest supercomputers in the world. The world's fastest supercomputer as of November 2012—Titan at Oak Ridge National Laboratory—can hit a speed of 17.6 petaflops. A petaflop is one quadrillion floating point operations per second, or a million billion.

An IU release on Big Red II says the system will be used for research on a variety of topics, from medicine and physics to fine arts and global climate research. "There are other universities that hold legal title to computers as fast or faster than Big Red II, but IU is the first in the world to have its own one petaflop supercomputer as a dedicated university resource," said Craig Stewart, executive director of the IU Pervasive Technology Institute and associate dean of research technologies. "Big Red II will be used by IU, for IU to support IU's activities in the arts, humanities, and sciences, and to support the economic development of Indiana—without any constraints from an outside funding agency."

Network World went on to describe more of the system's specifics. It contains a total of 21,824 processor cores, 43,648GB of RAM, and 180TB of local storage. Big Red II uses both GPU-enabled and standard CPU compute nodes, with each of the 344 CPU nodes using two 16-core AMD Abu Dhabi processors and each of the 676 GPU nodes using one 16-core AMD Interlagos and one NVIDIA Kepler K20.
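Those figures are internally consistent: the node counts add up to the quoted core total, and a back-of-the-envelope peak estimate lands right around one petaflop. Here is a minimal sketch of that arithmetic, assuming roughly 1.17 teraflops of double-precision peak per K20 and about 160 gigaflops per 16-core Opteron socket; those per-device figures are ballpark assumptions, not numbers from IU or Cray.

```python
# Rough sanity check on Big Red II's published specs. Per-device peak numbers
# are ballpark assumptions, not figures from the article.

CPU_NODES, GPU_NODES = 344, 676

# Core count: two 16-core Opterons per CPU node, one 16-core Opteron per GPU node.
cpu_cores = CPU_NODES * 2 * 16 + GPU_NODES * 1 * 16
print(f"CPU cores: {cpu_cores:,}")   # 21,824 -- matches the article's total

# Peak double-precision throughput, in teraflops (assumed, not official).
K20_TFLOPS = 1.17        # assumed peak for one NVIDIA Kepler K20
OPTERON_TFLOPS = 0.16    # assumed peak for one 16-core Opteron socket at ~2.5 GHz

peak_tflops = (GPU_NODES * K20_TFLOPS
               + (CPU_NODES * 2 + GPU_NODES) * OPTERON_TFLOPS)
print(f"Peak: ~{peak_tflops:.0f} TF (~{peak_tflops / 1000:.1f} PF)")
```

Under those assumptions, the large majority of the machine's peak comes from the GPUs.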

Big Red II replaces Big Red (which debuted in 2006). That supercomputer only reached speeds of 28 teraflops, so the sequel is a significant jump. In his speech, Messina said the new system should help the university tremendously, attracting both big research dollars and top-notch faculty talent.

They're going to run Maya on a petaflop supercomputer?...Are they planning on releasing a bunch of feature-length movies?

I smell university politics at work; I bet they had to throw a sop to the arts departments to get them to sign off on something to do with the project. In return, they get to run some relatively trivial task for a few seconds rather than buying a workstation for $20k.

Yeah, humanities grad reporting in here. What would I be using that for? I barely need all the features in Microsoft Word.

Joking aside, there are a lot of tasks in the humanities that need a computer to make practical, but this is still too much in a form that's not very good for us. You can't sit down at a supercomputer and work on Stata.

You have to remember that with most Top 25 supercomputers like this, because owning a slot in the rankings generates such massive publicity for the hardware vendors involved, the hardware is often subsidised with large grants (that don't show up in the news articles) that neutralise the hardware cost.

There is particular evidence for this here because AMD just doesn't have the generalist server performance/watt to justify being used in this if it were a free choice. Performance/watt is the key metric in a large installation that's likely to be limited by total power supply density. I like AMD, but they're a process node down (32nm vs 22nm) and behind on architecture.

The way IU is pitching this is kind of amusing. Both Stampede at UT Austin and Blue Waters at the University of Illinois are indeed primarily NSF systems, but both have substantial portions of time dedicated to own-campus projects. That time is on top of what local faculty win in the nation-wide proposal process, for which they're likely to be well-placed to begin with. Even better, time on those larger systems can be used to solve bigger problems or get results faster, because of their greater capacity.

As for the fine arts stuff, one application I can think of that could use seriously huge computational resources is deep multi-spectral imaging of older works. Those studies generate a lot of data, and the processing to extract interesting results is quite intensive. Being able to load it all into RAM and solve optimization problems over it efficiently is rather helpful.

Quote:

You have to remember that with most Top 25 supercomputers like this, because owning a slot in the rankings generates such massive publicity for the hardware vendors involved, the hardware is often subsidised with large grants (that don't show up in the news articles) that neutralise the hardware cost.

There is particular evidence for this here because AMD just doesn't have the generalist server performance/watt to justify being used in this if it were a free choice. Performance/watt is the key metric in a large installation that's likely to be limited by total power supply density. I like AMD, but they're a process node down (32nm vs 22nm) and behind on architecture.

Cray may have discounted this hardware a bit relative to list prices, but I doubt it was for the reasons you state. After all, most of the FLOPS in the system (and in any XK7 installation) come from the GPUs, which are quite efficient in power usage.

Rather, Cray has shifted architecture to Intel parts in the subsequent-generation XC30 systems. That potentially leaves them with an inventory of older XE/XK boards that is depreciating rather rapidly. Part of that disadvantage does come from AMD's competitive position versus Intel, but part of it is just that the XC30 network is much nicer, and the overall design ought to end up more reliable, since they've worked on that pretty heavily in each generation.

You know that the deans in all the schools will be looking for some way to use this as a recruiting tool for faculty and grad students. IU has arguably the top music department in the US. There are branches of vocal pedagogy and musicology that push pretty hard into the physics of sound. I could see things like real-time acoustical modeling of large spaces being something someone wants to get into. I doubt it would need a petaflop, but it wouldn't hurt.

This is always a great thing for a university to have. There are several proof-of-concept things in HPC that are difficult to get funding and processor-hour allocations for. Having an on-site facility means you can scratch those itches.

Quote:

They're going to run Maya on a petaflop supercomputer?...Are they planning on releasing a bunch of feature-length movies?

I smell university politics at work; I bet they had to throw a sop to the arts departments to get them to sign off on something to do with the project. In return, they get to run some relatively trivial task for a few seconds rather than buying a workstation for $20k.

At a lot of colleges, 3D animation is considered fine art. There is a lot of research in 3D graphics that requires massive computing power. Sea wave simulation, for instance. And the results can be used both in engineering (boats, etc.) and in film, just like computer graphics in the old days of vista generation (clouds, planets, fractal landscapes). So a supercomputer is great.

There is also a NOVA science documentary on analyzing how the old masters painted (brush strokes and all that) and how that allows for checking for fakes.

Last but not least, they have to be politically correct, as in making the university's resources available to all branches. But if those branches have nothing to submit, the computing resources will of course be used by the other sciences.

I doubt there's any incentive to create custom software tailored to the fine arts that can utilize the petaflop resources. However, that doesn't mean that the computer can't have a VM running on it for fine arts usage. I'm sure the above logic applies to many other academic fields as well.

Edit: Actually, I take the first part back. One can usually find a perspective on the fine arts that can be tied to a heavy computational task, e.g. pattern recognition across collections of art.

Quote:

At a lot of colleges, 3D animation is considered fine art. There is a lot of research in 3D graphics that requires massive computing power. Sea wave simulation, for instance.

Something like this? Sure, that can require a fair amount of computer time. But look at the departments involved: the Graduate School of Oceanography, the NOAA/NCEP Environmental Modeling Center, and Goddard Space Flight Center.

Quote:

There is also a NOVA science documentary on analyzing how the old masters painted (brush strokes and all that) and how that allows for checking for fakes.

Are you thinking of this? It was introduced by an astrophysicist and featured professors of mathematics, statistics, and artificial intelligence. It did include an artist; her role was to produce a fake painting to test whether the scientists could pick it out.

There's certainly a bunch of artists producing work through digital imaging. You can see some examples in the SIGGRAPH exhibit here. But I can't see anything that would stress a reasonably powerful desktop.

Quote:

Last but not least, they have to be politically correct, as in making the university's resources available to all branches. But if those branches have nothing to submit, the computing resources will of course be used by the other sciences.

Yeah, universities are as rife with politics as anywhere else. Some would say more so. If someone genuinely has a project led by a team from the fine arts that would require petaflop levels of computing power, then I'd like to know about it. But I'm skeptical.

I'm interested in hearing more about the constraints imposed by outside funding sources for supercomputer uses. Are there political angles to it, like prohibitions against climate science, or requirements that a certain percentage of resources be devoted to DoD projects?

A few of my college's alumni held key roles at Pixar. One hundred and fifty years ago, when they released Toy Story, one of them, a physics grad and sometime physics professor who was Pixar's chief scientist at the time, wrote an article for the college magazine comparing the amount of computation involved in creating Toy Story with other large-scale computational problems of the era. It was in the same ballpark.

This wasn't a big surprise, because at about the same time, I'd installed and configured a small collection of DEC Alpha systems as a render farm for a small group of 3D artists. While I was explaining the benefits to one of the artists, he chuckled and said, "That's nice, but it always takes X minutes to render a frame." I looked at him, puzzled, until he explained that 3D artists basically crammed as much detail into a frame as they could until the render times grew unbearable.

I doubt that has changed much. Much (though not all) computer art involves simulations, and like most simulations, the complexity of the simulation expands to fill available computation.

You may quibble that Toy Story, etc., aren't "fine art," and I'd probably agree with you, but what makes something fine art has little or nothing to do with the tools or the medium. Similarly, the fact that astrophysicists, mathematicians, and AI professors are involved in a project doesn't mean that it is not also a "fine arts" project. Nor does it necessarily mean that the computations would have been eligible to run on a restricted-use supercomputer.

This whole line of argument is silly. Huge supercomputers like this are shared resources. Some jobs might occupy a majority of the resources for days or weeks, but others are resident on a much smaller subset for a much shorter time. So perhaps jobs from fine arts students and faculty may never need a whole petaflop for weeks, but they sure as hell will be able to use fractional petaflops for hours. That's good for efficiency, good for science, good for scientists, good for artists, and just generally good for the human endeavor.

And here I am at Purdue running VLSI simulations on an Opteron cluster from 2006. Debugging is quite a pain when each simulation run takes 8 hours...

For anyone suffering in conditions like this, apply for NSF machine time. The XSEDE allocation policies are not that strict, and the resources are there explicitly to support ongoing research projects. Some of them, like Kraken, are even occasionally seriously underutilized.

Quote:

I'm interested in hearing more about the constraints imposed by outside funding sources for supercomputer uses. Are there political angles to it, like prohibitions against climate science, or requirements that a certain percentage of resources be devoted to DoD projects?

One has to apply for access, write up a reasonably substantial proposal, and wait for it to be approved. Then you need to set up accounts, which can add a week or two at some sites (or longer for international students), and keep track of usage to make sure you don't burn through your allocation before your project is wrapped up.

In contrast, a campus resource is often allocated in large part under a fair-share scheduling policy, in which various units buy into the resource and can expect that their jobs will run at least a corresponding fraction of the time in aggregate. Thus there's no need for proposals, accounts are generally tied to common campus IT resources, and there's no separate accounting for the time used. Until your demands become greater than the campus resource can provide, it's a much easier way to work.
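To make the fair-share idea concrete, here is a toy sketch. This is an illustration of the concept, not IU's or Illinois' actual scheduler configuration; the groups, shares, and usage numbers are invented, and the 2^(-usage/share) decay is just one common form of the priority factor.

```python
# Toy fair-share priority: groups that have consumed less than their purchased
# share of recent machine time get a higher scheduling priority. All numbers
# below are made up for illustration.

def fairshare_factor(share_fraction: float, usage_fraction: float) -> float:
    """Return a priority factor in (0, 1]; 0.5 means usage exactly matches the share."""
    if share_fraction <= 0:
        return 0.0
    return 2 ** (-usage_fraction / share_fraction)

groups = {
    # group: (fraction of machine bought, fraction of recent time actually used)
    "physics":   (0.40, 0.55),   # over their share -> lower priority
    "chemistry": (0.40, 0.20),   # under their share -> higher priority
    "fine_arts": (0.20, 0.01),   # barely used -> jobs run almost immediately
}

for name, (share, used) in groups.items():
    print(f"{name:10s} priority factor: {fairshare_factor(share, used):.2f}")
```

The point is simply that an under-served unit's jobs float to the front of the queue without anyone writing a proposal.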

The University of Illinois, home to Blue Waters, also has a much smaller campus cluster program running along those lines that's quite popular. Its first instance has a peak of ~130 teraflops, and the second instance being built now will add an additional 170 teraflops, or possibly a bunch more if a lot of users decide to fund a bunch of GPU nodes. We just aren't crowing about it, because we know that it's not really news. I expect a lot of other universities with substantial science and engineering programs are doing the same sort of thing.

Hmmmm. Is it common for supercomputers to have such a high ratio of RAM to storage (~23%)?

Either way, it seems I'm behind the specifications curve. Maybe the dealmaster will grace us with a deal on 500GB bundles of DDR3 this week....... /really_bad_joke

edit: typo, or -> of

The 180TB is probably just the locally attached storage, i.e. what is connected directly to the compute nodes.

A common usage scenario is to have a small amount of very fast storage attached directly to the compute nodes, intended to be used only by the currently running calculations. Once the calculations are done, the computed data is copied to a dedicated fileserver, which has a much larger amount of slower, cheaper storage.
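A minimal sketch of that stage-out pattern, with hypothetical paths and a stand-in for the actual calculation:

```python
# Hypothetical stage-out pattern: compute against fast node-local scratch,
# then copy finished results to the shared fileserver. Paths are placeholders.
import shutil
from pathlib import Path

LOCAL_SCRATCH = Path("/local/scratch/myjob")    # fast storage on the compute node (hypothetical)
SHARED_STORE = Path("/shared/project/results")  # larger, slower fileserver (hypothetical)

def run_job() -> Path:
    """Stand-in for the real calculation; writes its output to local scratch."""
    LOCAL_SCRATCH.mkdir(parents=True, exist_ok=True)
    out = LOCAL_SCRATCH / "output.dat"
    out.write_text("results go here\n")
    return out

result = run_job()
SHARED_STORE.mkdir(parents=True, exist_ok=True)
shutil.copy2(result, SHARED_STORE / result.name)  # stage out once the run is finished
```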

On the supercomputer I use, files on the local storage are automatically deleted after 28 days of inactivity to prevent people from using the local storage as long-term storage.
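The purge policy itself is conceptually simple; here is a sketch along those lines. The 28-day window matches the policy described above, while the scratch path and the decision to delete (rather than just report) are assumptions.

```python
# Sketch of a scratch-purge policy: remove files not accessed in 28 days.
# A real purge script would also handle errors and clean up empty directories.
import time
from pathlib import Path

SCRATCH_ROOT = Path("/local/scratch")   # hypothetical scratch filesystem
MAX_IDLE_SECONDS = 28 * 24 * 3600       # 28 days of inactivity

now = time.time()
for path in SCRATCH_ROOT.rglob("*"):
    if path.is_file() and now - path.stat().st_atime > MAX_IDLE_SECONDS:
        print(f"purging {path}")
        path.unlink()
```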

Quote:

Hmmmm. Is it common for supercomputers to have such a high ratio of RAM to storage (~23%)?

At IU, too, there is a high-speed, longer-term storage system of over 5 petabytes that people can use and access from any university system. It's called the 'Data Capacitor'.

/me works at IU...

Awesome, thanks for the answer! That makes much more sense to me; I had a hard time believing that all of the calculations could fit in 180TB. Also, I love the term "Data Capacitor".

A little number crunching shows that, relative to long-term storage, you guys still have about 5 times as much RAM as I do on my home rig. But that is way closer to my expectations than the initial comparison (~200x).
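For anyone who wants to redo the supercomputer side of that arithmetic, here it is using the figures quoted in the article and in this thread (the home-rig side is left out, since those specs weren't given):

```python
# RAM-to-storage ratios for Big Red II, using figures quoted in the article
# and in this thread.
ram_gb = 43_648                         # total RAM
local_scratch_gb = 180 * 1024           # 180 TB of node-local storage
data_capacitor_gb = 5 * 1024 * 1024     # ~5 PB shared "Data Capacitor"

print(f"RAM vs. local scratch:  {ram_gb / local_scratch_gb:.1%}")   # ~23.7%
print(f"RAM vs. Data Capacitor: {ram_gb / data_capacitor_gb:.2%}")  # ~0.83%
```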

Quote:

And here I am at Purdue running VLSI simulations on an Opteron cluster from 2006. Debugging is quite a pain when each simulation run takes 8 hours...

For anyone suffering in conditions like this, apply for NSF machine time. The XSEDE allocation policies are not that strict, and the resources are there explicitly to support ongoing research projects. Some of them, like Kraken, are even occasionally seriously underutilized.

Social sciences fall under that, maybe? I can certainly see a lot of reasons to need a supercomputer for some social studies.

I could also see uses for analysis of images as they relate to artwork. Maybe pattern analysis of written works. Dunno, not saying much of it leaps out at me as "best idea ever!", but that doesn't mean some professor or grad student isn't going to come up with some fun idea in "fine arts" that really could benefit from petascale number crunching.