
wiredmikey writes "Supercomputer maker Cray today said that the University of Illinois' National Center for Supercomputing Applications (NCSA) awarded the company a contract to build a supercomputer for the National Science Foundation's Blue Waters project. The supercomputer will be powered by new 16-core AMD Opteron 6200 Series processors (formerly code-named 'Interlagos'), a next-generation GPU from NVIDIA called 'Kepler,' and a new integrated storage solution from Cray. IBM was originally selected to build the supercomputer in 2007, but terminated the contract in August 2011, saying the project was more complex and required significantly increased financial and technical support beyond its original expectations. Once fully deployed, the system is expected to have a sustained performance of more than one petaflops on demanding scientific applications."

Along with the Cray they are upgrading (#3 in the world now, will be #1 when complete) and the one Lockheed Martin ordered (3 days ago), this is the third supercomputer ordered in the last 3 weeks to use Opterons (16-core Bulldozer).

The CPU sucks so much that it is exclusively dominating the SUPERcomputer market.

Designing supercomputers involves a lot of investment in inter-CPU messaging and memory sharing. Once a supercomputer vendor has committed to a platform, it's not easy to migrate to another. Given the volumes they sell, design costs have to be spread over just a few actual installations. Maybe AMD was the best platform to use when these computers were originally designed, but they are outdated now. The fact that these new AMD CPUs will work in "ancient" sockets and use the same interconnects will make the development cost for a performance upgrade lower.

Obligatory car metaphor: Most car manufacturers put old technology in cars they bring out today as well, just because the cost of developing new technology and building production lines is commercially prohibitive.

AMD has held an advantage in systems with more than 4 sockets for a while. You just don't see that many Intel x86-based systems with a dozen or more sockets. Itanium tends to be used in those systems.

Not sure if they count as systems, but Intel has about 75% of the TOP500 list with AMD about 13%. And that's coming from a period where AMD has had really strong Opterons. But then I don't think each node has a dozen or more sockets...

The Xeon has gotten much better, but I will bet that a lot of those systems use smaller SMP nodes coupled with InfiniBand or some custom interconnect as their structure, where the AMD systems use larger NUMA clusters linked with InfiniBand or some custom interconnect. That is just my guess, of course.

Trolling? OK I'll bite. "So you don't think it has anything to do with the fact that the energy density of gasoline is 12,200 Wh/kg while lithium ion batteries have an energy density of 439 Wh/kg plus a limited charge cycle of ~1000?"

Cars are still gas-powered because of massive collusion. They are now finally starting to bring out production EVs because China was going to do it sooner or later -- they're producing electric vehicles of all types as fast as they possibly can; most of them suck, but it's only a matter of time, as there's a lot of problem-solving to do, but EVs are conceptually simple. But the big three automakers can agree on one thing, and that is that EVs are bad for their bottom line. The dealers for the Big 3 depend on

The one big problem I have is that GM already did it. The EV1 is still the mark we homebuilt-EV guys compare to. All they had to do for today's market is extend it slightly, add a back seat, and drop in some lithium batteries.

All they had to do for today's market is extend it slightly, add a back seat, and drop in some Lithium batteries.

And make it meet the federal crash test standards that they helped write in order to keep cars like that off the market, in order to step on the people of California's attempt to improve our air quality through vehicle emissions reductions.

It is true that it costs a lot to switch processors, but let's remember that HPC systems are also very price sensitive. Blue Waters will have more than 50,000 processor sockets in it. Xeon processors may be better than Opterons, but they also cost a LOT more. Multiply that by 50,000. In the benchmarks I've seen, 16-core Opterons almost compete with 8-core Xeons in raw performance, but blow the Xeons away in price/performance.

Because of AMD's design, glue logic is cheaper. You can see this reflected in the cost of all levels of motherboard for both AMD platforms vs. their Intel competition. This is especially important in a supercomputer. AMD has been easier to build into massively parallel systems for longer. The Intel processors are slightly snazzier dollar for dollar, but not that much more amazing. Therefore there are only two reasons you would use Intel over AMD to build a supercomputer (cluster size, maximum power limitations).

Obligatory car metaphor: Most car manufacturers put old technology in cars they bring out today as well, just because the cost of developing new technology and building production lines is commercially prohibitive.

Not quite - car technology lags behind the marketplace because type acceptance on electronics takes years. A new engine can take six or seven years. Especially on low-margin cars, like compacts. A single warranty recall can blow the profit margin on an entire production run. They want the latest tech in their products, but they aren't going to throw profit out the window if it breaks.

Ever since reading Jurassic Park, I've always wanted a Cray supercomputer. No other supercomputer company had a hand in bringing dinosaurs back to life. Once you've resurrected dinosaurs, I don't think that can be topped. I wonder if U of I is planning on doing any dinosaur resurrections with their new supercomputer.

Indeed. The movie was garbage. Spielberg at his worst. Turned a great story into a "Hey, it's full of computer-animated dinosaurs!" movie. Changed a lot of characters, too. What's his problem with a lawyer being a fairly decent character? Not all of them are scum.

Seymour Cray wasn't just an engineer, he was also a marketing showman. He designed the supercomputer to have form as well as function for the express purpose of capturing market share among those who held the corporate purse strings. Even the Fluorinert cooling feature was turned into a work of art with that whole waterfall reservoir thing.

It's worth noting that the "new Cray", while they obviously don't make the old vector processor systems that they did originally, makes a really nifty hybrid cluster/SSI (single system image) supercomputer that is notably different from most of what's on the market. Man, seeing articles like this makes me want to get back into HPC stuff. I'm making a bit more doing this corporate crap, but I really miss getting to play with the cutting-edge stuff.

They didn't just buy the name. They also bought all of the people who designed and built those earlier Cray machines. There are still people at Cray who had a hand in the original Cray 1. It's actually a rather nice mix of expertise, multithreading experience from the Tera side, scalable MPP and vector experience from the Cray Research side.

It's far more likely that you just have better things to do with your time now. It's ok, you'll be able to read reviews from people who still have nothing better to do with their time (or get paid to do it) to help make that decision.

Last time I was at the Air and Space Museum in Washington DC I saw a Cray supercomputer http://www.nasm.si.edu/collections/artifact.cfm?id=A19880565000 [si.edu]
I was extremely excited and tried to show my kids, who only saw a very weird big computer thing. A new supercomputer built by Cray sounds like a great idea. :)

Cray has had supercomputers on the top ten list (and even in the number one spot) again for years now. Ever since they spun off from SGI they've had one of the more interesting architectures in HPC. I was interviewing at ORNL when they were installing Jaguar [ornl.gov], and I got a pretty in-depth description of the hows and the whys. It's no longer the most powerful computer in the world, but it's still a very impressive piece of machinery. Sigh. I really need to get back into HPC.

Names like Cray and Silicon Graphics are associated with a time when most of us could only imagine what incredible technology existed behind closed doors, inaccessible to mere mortals. Now all the excitement is behind commodity items that sell for $500 or less. It's fantastic. Yet, where's the mystique? I miss it.

The mystique is in scaling. It's very hard to run codes on hundreds of thousands of cores and get decent performance. Communication is a huge problem which is why you still see custom interconnect on the high-end systems. Memory architectures on these machines are pretty exotic. It's not just about having a fast processor. It's more about making sure that you can feed that fast processor.

I would normally say, "This isn't your father's IBM", but with respect to Mr. Buffett's age, I'm not sure it is his father's IBM, either.

In the 60's and 70's IBM was the company to work for.

In the 80's they began cutting.

In the early 90's they were slashing. We were trying to buy an RS6000 and from week to week I didn't know who I was talking to as the people were exiting so fast. When I ran into difficulty with a security flaw I found myself talking to someone from IBM in Australia who had them send me

I've been working with an agency that contracted a large project to IBM a few years ago. The results have been... unimpressive. The training was largely a waste of time; I don't believe they even understood their audience.

Better to see Cray, I think, as IBM is shopping out a bit too much of their work to people who aren't up to it... unless IBM has seen the light.

As covered earlier here [slashdot.org], IBM backed out of the contract because they thought they wouldn't be able to meet the performance requirements for existing codes. They were concerned about clock speeds (POWER7 [wikipedia.org] runs at 4 GHz). POWER7 excels not only at single-thread performance but also in fat SMP nodes.

What NCSA ordered now is a system that is pretty much the antipode of the original Blue Waters: the Bulldozer cores are sub-par at floating point performance, so they'll have to rely on the Kepler GPUs. Those GPUs are great, but to make them perform well, NCSA and U of I will have to rewrite ALL of their codes. Moving data from host RAM to GPU RAM over slow PCIe links can be a major PITA, especially if your code isn't prepared for that.

Given the fact that codes in HPC tend to live much longer than the supercomputer they run on, I think it would have been cheaper for them to give IBM another load of cash and keep the POWER7 approach.

And what, pray, is wrong with "maths"? It is a contraction of mathematics, plural. I agree with the use of "codes" however as while it is syntactically well formed, a better and more correct term to convey the meaning is "orders". For your information, you spelled "furcating" incorrectly, by the way.

Well, they won't have to completely rewrite all of their codes thanks to OpenACC [cray.com]. They will probably still have to do a bit of restructuring (and that's not a small task) but the nitty-gritty low-level stuff like memory transfers should be handled and optimized by the compiler.

That's similar to what PGI [pgroup.com] is doing. And you know what? It's not that simple. You seldom achieve competitive performance with this annotation-type parallelization, simply because the codes were written with different architectures in mind.

This is also the reason why the original design emphasized single thread performance so much. The alternative to having POWER7 cores running at 5 GHz would have been to buy a BlueGene/Q with many more, but slower, cores. They didn't go into that avenue because they knew that their codes wouldn't scale to the number of cores well.

You seldom achieve competitive performance with this annotation-type parallelization, simply because the codes were written with different architectures in mind.

The nice thing about this is the restructuring one does for GPUs generally also translates into better CPU performance on the same code. So one can enhance the code in a performance-portable way. That isn't possible to do without compiler directives. With directives, one can avoid littering the code with GPU-specific stuff.

They didn't go into that avenue because they knew that their codes wouldn't scale to the number of cores well.

This article [hpcwire.com] explains that five years ago when NCSA made the bid, accelerators were very exotic technology. The move toward GPUs was actually at the behest of scientists who now see a way forward to speed up their codes with accelerators. Technology shifts and we adapt.

If they are so willing to adapt, why weren't they willing to accommodate IBM's change requests? It's not like IBM was totally unwilling to build a $200 million machine.

None? I know of several. It's all still in its infancy of course, but I'm convinced it's possible to get good speedup from GPUs on real science codes. It's not applicable to everything, but then that's why they aren't CPUs.

I was referring to annotations for GPU offloading. Codes that run on GPUs are in fact so common nowadays that you'll be asked at conferences why you didn't try CUDA if you present any performance measurements sans GPU benchmarks. :-)

If they are so willing to adapt, why weren't they willing to accommodate IBM's change requests?

I don't have any knowledge of what those change requests were, so I don't know the answer. Everything I have read indicates that IBM wanted too much money.

It's not like IBM was totally unwilling to build a $200 million machine.

From what I have read, it seems that they were. They couldn't keep their costs low enough to justify the expense.

I don't have any knowledge of what those change requests were, so I don't know the answer. Everything I have read indicates that IBM wanted too much money.

From what I have read, it seems that they were. They couldn't keep their costs low enough to justify the expense.

True, but only because of the strict requirements of NCSA. If they had been willing to change them, a BlueGene/Q would have been viable.

Ah, I misunderstood. I don't think directives have been around all that long (PGI's earlier directives and CAPS's directives come to mind) and they certainly weren't standardized. OpenACC, like OpenMP, should allow scientists to write more portable accelerator-enabled code. In fact the OpenACC stuff came out of the OpenMP accelerator committee as explained here [cray.com]. I think it's highly likely some version of it will be incorporated into OpenMP.

The reason why I'm so allergic to annotation-based parallelization is the experience folks had with OpenMP. The common fallacy about OpenMP is that it is sufficient to place a "#pragma omp parallel for" in front of your inner loops and *poof* your performance goes up. But in reality your performance may very well go down, unless your code is embarrassingly parallel.

The bulk of the new Blue Waters will be pure Opteron nodes, with only 35 of the 270-ish cabinets using GPUs. They obviously are assuming that most users will be running with only x86 cores. They ordered a few dozen cabs of GPUs, probably with an eye to where the industry will be heading over the lifetime of the machine, not where the users are today.

It's true that Interlagos cores are a poor competitor to POWER7 core to core. However, they fare much better if you use the entire module. Think of Interlagos as