Posted by CmdrTaco on Wednesday June 23, 2010 @12:43PM
from the that's-a-lotta-flops dept.

coondoggie writes "Not known for taking the demure route, researchers at DARPA this week announced a program aimed at building computers that exceed current peta-scale computers to achieve the mind-altering speed of one quintillion (1,000,000,000,000,000,000) calculations per second. Dubbed extreme scale computing, such machines are needed, DARPA says, to 'meet the relentlessly increasing demands for greater performance, higher energy efficiency, ease of programmability, system dependability, and security.'"

Actually, the military being able to crack encryption is in some sense a Good Thing. It enables them to conduct espionage and counter-espionage against adversaries such as North Korea and Al-Qaeda. Yeah, that's kind of a Cold War mentality, but what is "cyber warfare" if not Cold War II?

First, I'm entirely ignorant of supercomputing. I don't know the first thing about it. I'm asking this out of sheer lack of knowledge in the field:

What do you need a computer that fast for?

I mean, specifically, what can you do on something that fast that you couldn't do on one 1,000 (or 1,000,000) times slower? What kind of tasks need that much processing power? For example, you normally hear about them being used for things like weather simulation. Well, what is it about weather simulation that requires so much work?

The whole idea is fascinating to me, but without ever having even been near the field, I can't imagine what a dataset or algorithm would look like that would take so much power to chew through.

It's even more interesting than that. If DARPA begins succeeding a lot, DARPA seniors end up having to explain to Congress (yes, directly to Congress) why it is they aren't forward-leaning enough. I.e., DARPA programs are expected to fail often, and Congress uses this failure rate as pro forma information about how "researchy" DARPA is.

You almost certainly don't want to wait 114 years to get your results.

You know, back in the day, we had some patience. Plus, the notion that one would have to wait 114 years to get results made us develop better algorithms, not just throw cycles at a problem. Kids these days... Now get off my lawn!

I don't recall the exact numbers, but I was amazed a while back when talking to weather scientists just how "blocky" even the best computer simulations of weather still are. IIRC, the simulations treat a square 2 kilometers on a side (or maybe it was 4 km) as a homogeneous unit. Of course, weather changes happen on much smaller scales than that. It was explained to me in terms of pixels on a screen -- the more you have, the more accurately the picture reflects reality -- but it also takes much, much more computing power. Imagine if the simulations used squares 1 meter on a side!

Actually, there are only a handful of variables in a weather simulation. For a typical cloud-scale simulation you have the three components of wind, moisture, temperature, pressure, and precipitation variables. Say, 13 variables. That is not why you need supercomputers.

The reason you need supercomputers to do weather simulations is all about resolution, both spatial and temporal. Weather simulations break the atmosphere into cubes, and the more cubes you have, the better you resolve the flow. All weather simulations are underresolved; to properly model the turbulent flow in the atmosphere you need to get down to cubes that are roughly a centimeter on a side. As you double the resolution (halve the length of each of the four lines that make up a cube face) you require eight times as many cubes. In weatherspeak, we talk about gridpoints instead of cubes, where it's understood that each gridpoint represents the center of one of these cubes. In the computer model, they are represented as three-dimensional floating point (or double precision) arrays. So take a 3D array and double the number of iterations of each of the three for loops, and you've got eight times as many calculations and eight times more memory required.
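To put a rough number on the memory side, here is a minimal Python sketch. The 13 variables come from the count given above; the grid sizes (100 and 200 points per side) and the 4 bytes per value (single precision) are assumptions picked purely for illustration, and `grid_memory_bytes` is a made-up helper, not something from a real weather model.

```python
def grid_memory_bytes(nx, ny, nz, n_vars=13, bytes_per_value=4):
    """Memory for n_vars 3D arrays of shape (nx, ny, nz).

    ASSUMPTION: 4 bytes per value (single precision); use 8 for double.
    """
    return nx * ny * nz * n_vars * bytes_per_value


# Hypothetical grid sizes, chosen only to show the scaling.
coarse = grid_memory_bytes(100, 100, 100)
fine = grid_memory_bytes(200, 200, 200)   # resolution doubled in x, y and z

print(coarse)          # 52000000
print(fine // coarse)  # 8
```

The ratio is 8 no matter which starting grid you pick, because the point count grows with the cube of the resolution.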

And it gets worse. When you double the resolution, you need to halve the time step. Weather models step forward in time in discrete intervals, so in addition to doing more calculations for each time step (eight times as many for doubling the resolution in three dimensions), you now need to take steps that are half as large, which means twice as many of them. This means 16 times more calculations, and eight times as much memory, to double the resolution.
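The arithmetic above can be sketched in a few lines of Python. This only encodes the scaling rule as stated (2x gridpoints per dimension and half the time step per doubling); `refinement_cost` is a hypothetical helper name, and real models have other costs this ignores.

```python
def refinement_cost(doublings):
    """Return (memory_factor, compute_factor) after doubling the
    resolution `doublings` times, per the rule described above."""
    points = 8 ** doublings   # 2x gridpoints in each of three dimensions
    steps = 2 ** doublings    # time step halves, so twice as many steps
    return points, points * steps


print(refinement_cost(1))  # (8, 16)
print(refinement_cost(2))  # (64, 256)
```

So two doublings (say, 4 km down to 1 km) already cost 256 times the calculations.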

And many of the calculations that are being made in the innermost loop involve things like divides, non-integer powers, square roots, etc... expensive calculations. And then because it's a massively parallel simulation, you have to do internode communications - which adds overhead and can be rather a bother. Then there's the hundreds of TB of data the model is dumping to disk. Now let's render that, shall we? Somebody call Pixar.

I am working on a project to simulate a thunderstorm which will produce a tornado in a "natural" way. The tornado needs to be adequately resolved. This simulation will have a grid spacing of 10 meters. It requires a computer which hasn't been fully built yet (Blue Waters, in Urbana, google it). The time step will be 0.01 seconds, and the model will run for two hours of model time. It will take days of wallclock time. Keep in mind this model will have a physical domain not much bigger than about half the area of Oklahoma. Imagine global climate modeling now, and now you're talking 4 km resolution being all you can do.
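For a sense of scale, a back-of-the-envelope Python sketch of those numbers. The time step, run length and grid spacing are from the post; the domain area is an assumption (Oklahoma is roughly 181,000 km^2, so half is taken as about 90,000 km^2), and the number of vertical levels isn't given, so only horizontal grid columns are counted.

```python
model_seconds = 2 * 60 * 60   # two hours of model time (from the post)
dt = 0.01                     # time step in seconds (from the post)
grid_spacing_m = 10           # horizontal grid spacing (from the post)

n_steps = round(model_seconds / dt)

# ASSUMPTION: half of Oklahoma's roughly 181,000 km^2 is about 90,000 km^2.
domain_area_km2 = 90_000
columns = domain_area_km2 * 1_000_000 // grid_spacing_m ** 2

print(n_steps)   # 720000
print(columns)   # 900000000
```

That is 720,000 time steps over roughly 900 million horizontal columns, before multiplying by however many vertical levels the model uses.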

This is why we need supercomputers to do high resolution weather simulations.