Posted
by
timothy
on Tuesday February 03, 2009 @09:46AM
from the because-the-nsa-hates-drm dept.

eldavojohn writes "When it's built, 'Sequoia' will outshine every super computer on the top 500 list today. The specs on this 96 rack beast are a bit hard to comprehend as it consists of 1.6 million processors and some 1.6TB of memory. That's 1.6 million processors — not cores. Its purpose? Primarily to keep track of nuclear waste & simulate explosions of nuclear munitions, but also for research into astronomy, energy, the human genome, and climate change. Hopefully the government uses this magnificent tool wisely when it gets it in 2012."

No. No I can't. I can't imagine a beowulf cluster of one of those. Even if Natalie Portman (covered in grits) was in my base killing my overlords like an insensitive clod. Even if netcraft confirmed it, then it confirmed netcraft in Soviet Russia. Especially if Cowboy Neal gave me a three-step plan leading to profit, I could not imagine it.

A group of computer scientists builds the world's most powerful computer. Let us call it "HyperThought." HyperThought is massively parallel, it contains neural networks, it has teraflop speed, etc. The computer scientists give HyperThought a shakedown run. It easily computes pi to 10,000 places and factors a 100-digit number. The scientists try to find a difficult question that might stump it. Finally, one scientist exclaims: "I know!" "HyperThought," she asks, "is there a God?" "There is now," replies the computer.

In all seriousness, how much processing power would it take to run a program that designs newer and better processors? I would think that 20 petaflops and a good algorithm would be able to produce a processor that is an improvement over the current generation. Then again, I know next to nothing about processor design, so I could be totally wrong.

The problem is not the hardware, it's the software. Who's going to write the initial algorithm?

Ok, given enough processing power you could do a genetic algorithm for processor design that actually provides useful solutions within a reasonable amount of time, but I have a feeling we're far from that point.

Hm, it's all about getting the right fitness function, isn't it? The processor that would be more fit would draw less power, compute stuff faster, be cheap to produce, etc. Then it could either have a compatible instruction set or a new one; in the case of a new one, it would have to be able to come up with a way of automatically translating stuff from the old instruction set, or targeting a compiler at it.
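For what it's worth, the fitness-function idea sketches out as a toy genetic algorithm in a few lines. Everything here is hypothetical: the three "genes" and the speed/power/cost formulas are made-up stand-ins for a real design space, which would be astronomically larger.

```python
import random

# Toy "processor design" as a parameter vector: (clock_ghz, cores, cache_mb).
# The formulas below are invented proxies, not real circuit models.
def fitness(d):
    clock, cores, cache = d
    speed = clock * cores * (1 + 0.1 * cache)  # crude throughput proxy
    power = clock ** 2 * cores                 # dynamic power grows ~ f^2
    cost = cores * 2 + cache                   # die-area proxy
    return speed / (power + cost)              # reward speed, punish power and cost

def mutate(d):
    clock, cores, cache = d
    return (max(0.5, clock + random.uniform(-0.2, 0.2)),
            max(1, cores + random.choice([-1, 0, 1])),
            max(1, cache + random.choice([-1, 0, 1])))

random.seed(42)
pop = [(random.uniform(1, 4), random.randint(1, 16), random.randint(1, 32))
       for _ in range(50)]
for _ in range(100):                 # evolve: keep the fittest half, mutate it
    pop.sort(key=fitness, reverse=True)
    pop = pop[:25] + [mutate(random.choice(pop[:25])) for _ in range(25)]
best = max(pop, key=fitness)
print(best, fitness(best))
```

The whole game is in `fitness`: change the weights on power vs. speed vs. cost and the "winning" design changes completely, which is exactly the hard part the thread is pointing at.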

The case with the new instruction sets sounds really, really interesting. I think the actual hardware design

Now, if only we could produce the software. Y'know, the set of "good algorithms" to produce the layouts, the other set of "good algorithms" to test fitness, and... everything else necessary to automatically produce solutions.

*That* seems to be the hard task at the moment. I don't design processors either, so I don't know what types of issues the current design software has, but to me it seems that this is probably the hurdle they are facing on that front, not processing power.

Ahh, Excel... the first choice in corporate database management systems.

How many other slashdotters work at Fortune XXX firms where, on paper, some executive bean counter says "we use Oracle," but on the ground all databases are done in Excel (along with a smattering of everything else)?

It is a step up from three jobs ago, where at another Fortune XXX the database management system of choice was what boiled down to an administrative assistant and Lotus's word processing solution. Yes, we used plain English to ask Patti to make changes instead of SQL UPDATE statements. Also, our SQL SELECT statements always began with "hey Patti, could you look up...". And yes, all "ORDER BY" stanzas were in fact powered by swear words and performed by cut and paste.

Yes, yes it does. The issues, while similar for multi-socket and multi-core systems, are different: a single multi-core processor shares its links to the system bus and main memory between the cores, whereas separate processors each have their own links. So as nomenclature goes, it is not that bad at all.

Another reference article:
http://www.eetimes.com/news/design/showArticle.jhtml?articleID=213000489 [eetimes.com]
Mentions "up to" 4,096 processors per rack. So, at maximum, this would be 393,216 processors.
Perhaps they are quad cores and someone took the liberty of multiplying the 393,216x4=1.6M (rounded).
A more reasonable assumption may be 100,000 quad-core CPUs (400,000 cores). That would make the summary's processor count off by only 16x, lol.

"BlueGene/P uses a modified PowerPC 450 processor running at 850 MHz with four cores per chip and as many as 4,096 processors in a rack. The Sequoia system will use 45nm processors with as many as 16 cores per chip running at a significantly faster data rate.

Both BlueGene/P and Sequoia consist of clusters built up from 96 racks of systems. Sequoia will have 1.6 petabytes of memory feeding its 1.6 million cores, but many details of its design have not yet been disclosed."

There we go. It is 100,000 processors, with 16 cores each (yes, a core is a processor, but since the summary went out of its way to make this distinction, we should continue to do so for a fair comparison).
Summary is wrong (big surprise there).

The article got it mostly right. It mentioned 500-teraflop once, but every other time it spelled flops correctly. Slashdot, on the other hand, fucked up the title, despite the fact that it pretty much just copied it from the article (poorly).

No, genius. His point is that "flops" is singular, with the s standing for second rather than forming a plural. His point is that you say 2 petabyte hard drive and 500 teraflops machine because those are the singular forms.

Heck, in 3.5 years, your desktop computer will be 4 times more powerful than anything currently running today, too.

For being so picky about the terms in the article, you are quite lax with your own. I seriously doubt my desktop, in 3.5 years, will be able to do ~6 petaflops. :) (4x more powerful than "anything" currently running today)

Furthermore, 20 vs. ~1.5 petaflops is a goodly sized jump for 3-ish years, isn't it? Computer speed growth has seemed to be slowing lately, with the emphasis being on multiple cores, not faster clock speeds like it was 10 years ago. So being able to get 20x the power of the current supercomputers in that time frame is impressive.
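A quick sanity check of that point, assuming a loose "performance doubles every 18 months" rule of thumb and taking ~1.5 petaflops as the 2009 leader (both are rough figures, not authoritative numbers):

```python
# Rough check: is a 20 PFLOPS machine in ~3.5 years ahead of the historical trend?
current_pflops = 1.5          # roughly the 2009 Top500 leader
years = 3.5
doublings = years * 12 / 18   # ~2.33 doublings at one per 18 months
trend_pflops = current_pflops * 2 ** doublings
print(round(trend_pflops, 1))  # 7.6 -- so 20 PFLOPS is well ahead of the trend
```

So even granting steady exponential growth, Sequoia's target is roughly 2.5x above where the trend line would put the fastest machine in 2012.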

It's obvious that I was referring to desktop computers with the "anything running today" wording.

If you're going to be that intentionally disingenuous, why don't you also say that I claimed desktop computers were going to have 4000+ horsepower, since there are industrial earth-moving equipment engines that currently put out over 1000.....wait a minute.....

Why can't we let private industry own the computer and have the government just purchase time on it? I for one would love to have CGI movies rendered in better-than-real time. This way, we taxpayers don't have to pay for idle time.

Also, I can design a database using SQLite with a web front end for keeping track of uranium or anything else, for that matter. As long as it is not measured in individual atoms, it'll run fine on my spare 2.4GHz single-core Celeron. There is no need to update the database 100M times a second.
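For illustration, the kind of thing the parent means fits in a few lines with Python's standard sqlite3 module. The table layout and sample rows are invented for the sketch; a real inventory system would obviously need more than this.

```python
import sqlite3

# Minimal sketch of a material-tracking database. Schema and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE inventory (
    batch_id INTEGER PRIMARY KEY,
    material TEXT NOT NULL,
    grams    REAL NOT NULL,
    location TEXT NOT NULL)""")
conn.executemany(
    "INSERT INTO inventory (material, grams, location) VALUES (?, ?, ?)",
    [("U-235", 120.0, "Vault A"), ("U-238", 5400.0, "Vault B")])
conn.commit()

# "hey Patti, could you look up..." as an actual SELECT statement:
total = conn.execute(
    "SELECT SUM(grams) FROM inventory WHERE material LIKE 'U-%'").fetchone()[0]
print(total)  # 5520.0
conn.close()
```

Point being: tracking batches of material is a bookkeeping problem measured in thousands of rows, not a petaflop problem.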

I would guess this beast will never be 100% operational at any moment of its existence.

I'm guessing the "cool" part of this won't be the bottomless pile of hardware in one room, but how they maintain this beast. Just working around one of the million CPU fans burning out is no big deal, but how do you deal with a higher level problem like one of the hundreds of network switches failing, etc?

Higher than you're guessing. I've worked on BlueGene/L and BlueGene/S and was involved in some of the development on BlueGene/P. All of these systems have an incredibly aggressive monitoring mechanism - voltages, temperatures, fan speeds, and half a dozen other hardware categories are monitored at the component level, and the data is stored in a database where it is analyzed to ensure that the system as a whole IS operational and stays that way.

Yeah, but an MPI job can't recover from a failed node. Except via checkpointing, of course. So if you launch a job on all those 1.6e6 processors, they had all better stay up, on average, long enough for you to make some progress and write a checkpoint before one node crashes.
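Some back-of-envelope arithmetic on why that matters at this scale. This assumes independent, exponentially distributed node failures (a common idealization), and the ten-year per-node MTBF is a made-up illustrative figure:

```python
# Expected time to FIRST failure anywhere in the machine is roughly the
# per-node MTBF divided by the node count (independent failures assumed).
node_mtbf_hours = 10 * 365 * 24              # assume one failure per node per 10 years
nodes = 1_600_000
system_mtbf_hours = node_mtbf_hours / nodes
print(round(system_mtbf_hours * 60, 1))      # 3.3 -- minutes between failures
```

With something failing every few minutes, the checkpoint interval (and the time to write a checkpoint across 1.6 PB of memory) becomes a first-order design constraint, not an afterthought.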

Hopefully the government uses this magnificent tool wisely when it gets it in 2012.

SCENE: The Pentagon, 2012

Science Advisor: "President Whoever-You'll-Be, IBM has completed our 20 petaflop computer. It is awaiting your command."

President Whoever-You'll-Be: "Thank you, Advisor. We can use it to compute the long-term effects of nuclear waste disposal, weather fronts, and... just... just how much processing power is in this?"

SA: *deep sigh* "Over 1.6 million processors and a total of 1.6TB of RAM, sir."

PWYB: "My GOD, Advisor. Do you know what that much power could do? It... it could...

"IBM reckons its 20-petaflops capable Sequoia system will outshine every single current system in the Top500 supercomputer rankings"

So the computer will be ready in 2012, and it will outperform computers from 2009?

These multi-year computer construction projects seem very problematic given the pace of change in technology. Memory changes, CPUs change, and the socket specs change — if it takes 3 years to build, it will be obsolete before it's ready. 2012 could be the year that ATI releases 10-petaflop GPUs.

I recall there is some sort of named ad-hoc "law" about this. When memory capacity falls significantly behind compute speed, the kinds of computing you can do are severely limited. I believe they mainly plan simulations, where gigaflops per output point are typical and memory needs are modest. Data processing, on the other hand, certainly wants balanced memory.

It could also be used to search for "suspicious behaviour" by searching Government databases, Credit card companies' databases, credit bureau databases, Choicepoint's, telecommunication companies' databases, airlines, and any other firm that the Government bullies into giving access.

Well, that's not as paranoid as you might think. The case against is quite simply the publicity that's been given to this behemoth of a machine, so I really don't think it's too likely in this particular case.

However this is EXACTLY how you go about putting together a machine for intelligence purposes. The key to running an intelligence service is deniability at as many levels as possible, and keeping anyone from seeing the big picture.

I think using a BlueGene for run-of-the-mill data processing would be a horrible waste of money. There's simply no need for things like a parallel filesystem or PB of RAM or low-latency interconnects. You want to "scale out" for distributed processing like you're talking about, not "scale up".

You don't need 20 petaflops to do that, you need a few tens of teraflops and a really really huge memory and really really fast IO. You'd do much better with some of the 1/4TB memory systems from Sun or IBM + spending a huge pile of money on SSDs than a real supercomputer.

The cost of the IO interconnect is a huge chunk of cash to sink into a supercomputer that you just don't need for that sort of tin foil hat application.

This computer would however be really good at brute-forcing crypto keys...

Not really; 2^N gets big fast. The sun won't output enough energy over its entire lifetime to allow a maximally efficient computer to even count from 0 to 2^256, let alone try to brute-force a 256-bit key. (From Applied Cryptography, which I don't have in front of me.)
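The argument can be rechecked roughly in a few lines. The Landauer-limit setup follows Schneier's version (an ideal computer running at the cosmic background temperature), and the sun's lifetime output is an order-of-magnitude estimate, not a precise value:

```python
import math

# Landauer limit: minimum energy to flip one bit at temperature T.
k = 1.38e-23                         # Boltzmann constant, J/K
T = 3.2                              # cosmic background temperature, K
joules_per_flip = k * T * math.log(2)

# Just incrementing a counter from 0 to 2^256 needs at least 2^256 bit flips.
energy_needed = (2 ** 256) * joules_per_flip

# Sun: ~3.8e26 W luminosity over a ~10-billion-year lifetime (rough).
sun_lifetime_joules = 3.8e26 * 1e10 * 365 * 24 * 3600

print(energy_needed > sun_lifetime_joules)   # True -- by ~10 orders of magnitude
```

So even the pure thermodynamic counting cost exceeds the sun's total output by a factor of about 10^10, before you do any actual cryptography per candidate key.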

Would you rather they set off nukes to study these things? The reason is that it takes a crapload of calculations to map out every reaction between molecules in an area measured in square miles. (And because of a test ban treaty we signed, we can't set off 'real' nukes to test anymore, so we have to simulate it.)

Uh, do you know how many molecules there are in "square miles"?
Do you know how many atoms you can calculate reactions between using both state of the art supercomputers and quantum chemistry tools?
Do you know the scaling behavior of quantum chemistry methods?
I mean, even with this new supercomputer, your estimate is off by ridiculously many orders of magnitude.
And that being said, nuke simulation has little to do with quantum chemistry anyways.

And that being said, nuke simulation has little to do with quantum chemistry anyways.

So why did you bring it up? The parent didn't. I don't get what you are saying when you ask a question, relate it to the parent's post, and then say it is irrelevant. You might as well have asked him how tightly the car he drives corners, if you are going to say it is irrelevant anyway.

And, do you realize how much processing power 20 petaflops is? That's insane; I'm having a hard time wrapping my head around it. That is well into the territory of the number of molecules in a small object.

The parent said that the computer will be used for "mapping every reaction" between molecules. Presumably, since reactions tend to require quantum mechanical descriptions, I guessed the parent meant that the new computer would allow doing such calculations for all reactions in a rather large area.

I don't get what you are saying when you ask a question, relate it to the parent's post, and then say it is irrelevant.

Just a gedanken experiment to amuse myself, while noting that it actually has nothing to do with simulating nuclear weapons. Don't get too worked up about it.

And, do you realize how much processing power 20 petaflops is?

Yes, it's about 2 orders of magnitude more than the supercomputer I'm using at the moment. A lot for sure, but still limited to very small system sizes for quantum mechanical calculations. At the moment, even the best methods in practice scale as N^3 or so. With my current 100 TFlops I might do a DFT calculation with O(10000) atoms or so. Two orders of magnitude more CPU power with N^3 scaling gives me roughly a factor of 5 more atoms. 50000 atoms fit into a box of roughly 10x10x10 nm (depending on the material etc., of course). Still a way to go until I'm able to do "square miles".

If you want to go into classical molecular dynamics, then you're obviously in much better shape. With the current supercomputer that's maybe around 1E9 atoms, and since MD scales linearly, with two orders of magnitude more flops it means around 1E11 atoms. Now these fit into a box on the order of 1 um^3. Again, still quite a way to go to square miles.
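The two estimates above can be redone numerically. The constants are the rough figures from the post itself (100x more flops, ~1e29 atoms/m^3 for a solid), not authoritative numbers:

```python
speedup = 100.0                   # "two orders of magnitude" more flops

# O(N^3) DFT: reachable atom count grows only as the cube root of the compute.
dft_factor = speedup ** (1 / 3)
print(round(dft_factor, 1))       # 4.6 -- roughly the "factor of 5 more atoms"

# O(N) classical MD: reachable atom count grows linearly with the compute.
md_atoms = 1e9 * speedup          # 1e11 atoms
box_m3 = md_atoms / 1e29          # assuming ~1e29 atoms/m^3 at solid density
print(box_m3)                     # ~1e-18 m^3, i.e. a box about 1 um on a side
```

Which is the whole point: cubic scaling eats almost all of the extra hardware, and even linear scaling leaves you a dozen orders of magnitude short of "square miles."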

In conclusion, atoms are really really tiny, and in 3 dimensions you can pack a lot of them into a very tiny volume.

Also, they so far have not needed to calculate what a nuclear bomb does for each atom (obviously, since it has been nigh impossible), and they probably won't ever need to really. You can study waves and energy effects in great detail, and simulate them accurately, without needing to know where each and every atom goes. This will simply let them be more precise and accurate, as well as speedy.

Yes, that was sort of implied in my previous post. The US nuke labs have been at the forefront in research on numerical methods in topics such as shock propagation (PPM and methods like that) and really really large FEM simulations. Obviously, the actual nuclear reactions are taken into account probabilistically rather than the full quantum mechanical treatment (as my above monologue shows, such a treatment for the primary is far beyond any computer in sight). AFAIK they use Monte Carlo neutron diffusion rather than the classical multigroup diffusion methods that AFAIK are still largely used for civilian reactor design. That being said, I'm sure they are doing a lot of atomic and quantum level simulations as well for small model systems designed to e.g. extract parameters for continuum simulations and such.

The real question is why are we still designing new nukes? Do they go obsolete after a few years? What are they tweaking the designs for? Better yield? We can send the planet into a nuclear winter already with what we've got.

Yes, nuclear weapons have a shelf life due to the components included in them - explosives, chemicals etc.

New designs are used to maximise yield per mass, enabling you to throw a smaller warhead at a target, which means less chance of interception. It also means a smaller package to maintain, and cheaper to build, along with more warheads per unit of material.

Interesting. There's still the matter of the software. Currently, these machines run straightforward simulation software. I don't see weather prediction resulting in sentience any time soon. The machines will probably have to grow considerably before the more flexible software subsystems, like say the load-balancing code, achieve sentience and destroy us all.

It's not a typo. A weather cock is a wind vane in the shape of a rooster. They were exceedingly common on barn roofs for decades (if not centuries) and are still fairly common today. Wikipedia image. [wikipedia.org] They tend to be inaccurate in low windspeeds because they're relatively heavy and their pivots are often poorly maintained.