
IBM has taken the wraps off the first servers powered by its monstrously powerful Power8 CPUs. With more than 4 billion transistors packed into a stupidly large 650-square-millimeter die built on IBM’s new 22nm SOI process, the 12-core (96-thread) Power8 is one of the largest, and probably the most powerful, CPUs ever built. In a separate move, IBM is opening up the entire Power8 architecture and its technical documentation through the OpenPower Foundation, allowing third parties to make Power-based chips (much like ARM’s licensing model) and to create specialized coprocessors (GPUs, FPGAs, etc.) that link directly into the CPU’s memory space using IBM’s new CAPI interface. You will not be surprised to hear that Nvidia, Samsung, and Google (three huge players among the hundreds beholden to Intel’s server monopoly) are core members of the OpenPower Foundation. The Power8 CPU and the OpenPower Foundation are the cornerstones of a very big, well-orchestrated plan to finally put an end to x86’s reign and place a fairer, more powerful architecture at the head of the server table.

First, we should talk about the new Power8 chip. There are 12 CPU cores, each with 512KB of L2 SRAM and 8MB of L3 eDRAM, for per-chip totals of 6MB of L2 and 96MB of L3 cache. Beyond that, there is 230GB/sec of memory bandwidth to up to 1TB of DRAM. Whereas each Intel Xeon core is capable of two-way simultaneous multithreading, and Power7+ cores can handle four threads, Power8 ups the ante to eight simultaneous threads per core (SMT8). As you’d expect, other parts of the chip have been similarly expanded to cater for Power8’s massive parallelism: there are eight decoders (up from six), six dispatches per clock cycle, a doubling of load units (to four), the data cache can now process four 128-bit transactions per cycle, and the bus between the L2 and the data cache is now 512 bits wide. Take a look at the block diagram below and be awed by its massive parallelism and throughput.
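
Those per-chip totals follow directly from the per-core figures; here is a quick sanity check in Python (the numbers come from this article, not from an IBM datasheet):

```python
# Aggregate cache and hardware-thread count for a 12-core Power8,
# using the per-core figures quoted above (512KB L2, 8MB L3, SMT8).
CORES = 12
L2_PER_CORE_KB = 512
L3_PER_CORE_MB = 8
SMT_WAYS = 8

total_l2_mb = CORES * L2_PER_CORE_KB / 1024  # 6.0 MB of L2 per chip
total_l3_mb = CORES * L3_PER_CORE_MB         # 96 MB of L3 per chip
hw_threads = CORES * SMT_WAYS                # 96 hardware threads

print(f"{total_l2_mb}MB L2, {total_l3_mb}MB L3, {hw_threads} threads")
```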

We expect the Power8 will eventually be capable of clock speeds around 4.5GHz, with a TDP in the region of 250 watts. At this speed, the Power8 CPU will be around 60% faster than the Power7+ in single-threaded applications, and more than two times faster in multithreaded tasks. In certain cases, IBM says the Power8 is capable of analyzing Big Data workloads between 50 and 1,000 times faster than comparable x86 systems (the same amount of RAM, the same number of cores).

Compared to its competitors (the Power7+, the Oracle Sparc T5, the Intel Xeon), the Power8 delivers anywhere between two and three times the processing power per socket. This is mostly due to the massive thread count (96 vs. 30 for the latest 15-core E7-8890 v2 Xeon) and the utterly insane memory bandwidth (230GB/sec vs. 85GB/sec). In terms of performance per watt, though, the Xeon (~150W TDP) is probably just ahead of the Power8; but in general, when you’re talking servers, power consumption plays second fiddle to performance density (how many gigaflops you can squeeze out of a single server).
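
A back-of-the-envelope calculation on the per-socket figures quoted above (the article’s numbers, not vendor benchmarks) shows where that two-to-three-times claim comes from:

```python
# Per-socket comparison using the figures quoted above.
power8 = {"threads": 96, "mem_bw_gbps": 230}
xeon_e7_8890_v2 = {"threads": 30, "mem_bw_gbps": 85}

thread_ratio = power8["threads"] / xeon_e7_8890_v2["threads"]      # 3.2x
bw_ratio = power8["mem_bw_gbps"] / xeon_e7_8890_v2["mem_bw_gbps"]  # ~2.7x

print(f"{thread_ratio:.1f}x threads, {bw_ratio:.1f}x memory bandwidth")
```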

IBM Power8 CPU die, labeled

Beyond raw SPECint and SPECfp performance, Power8 also introduces CAPI (Coherent Accelerator Processor Interface). CAPI is a direct, coherent link into the CPU, allowing peripherals and coprocessors to communicate with the CPU while bypassing (substantial) operating system and driver overheads. CAPI is similar to Intel’s QPI, but where QPI is closed and proprietary, IBM is opening CAPI up to third parties. IBM’s Power Systems CTO, Satya Sharma, told me in an interview that for flash memory attached via CAPI, the overhead is reduced by a factor of 20. More importantly, though, CAPI can be used to attach coprocessors (GPUs, FPGAs) directly to the Power8 CPU for some truly insane workload-specific performance boosts. It is these CAPI-attached coprocessors that allow a Power8 system to be 1,000 times faster than a comparable x86 system.

There’s an inevitable diminishing marginal return to parallelizing any software; Amdahl’s law dictates it. Even multimedia encoding and 3D rendering will eventually be limited by the percentage of the code that can be parallelized (I’m not aware of any practical code that is 100% parallel).
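
Amdahl’s law is easy to put in concrete terms; here is a minimal sketch, where the 95% parallel fraction is an illustrative assumption, not a measured figure:

```python
def amdahl_speedup(parallel_fraction: float, n_threads: int) -> float:
    """Maximum speedup of a workload where `parallel_fraction` of the
    serial runtime can be spread evenly across `n_threads` threads."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_threads)

# Even a workload that is 95% parallel tops out well below 96x on a
# 96-thread Power8 socket:
print(amdahl_speedup(0.95, 96))     # ~16.7x
print(amdahl_speedup(0.95, 10**9))  # asymptotic ceiling: ~20x
```

The 5% of serial code caps the speedup at 20x no matter how many of those 96 hardware threads you throw at it.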

chojin999

3D rendering algorithms can be parallelized much more easily than video/audio stream encoding. Although some algorithms are shared at the low level, encoding video and audio streams is just not the same as 3D rendering.

By Handbrake you actually mean x264/libavcodec threads; I seem to remember something like 16 threads for 1080p, unknown for UHD-1/UHD-2. Perhaps all of them, considering it has the bandwidth to push decode at full speed without bottlenecks?

The vast majority of PCs are some shit $500 HP or Dell. That has nothing to do with what I said. I said what I want.

Tom

Would be cool if CAPI were a socketable interface as well; otherwise a lot of the benefit would be lost in not being able to bolt graphics, memory, and storage hardware directly onto the processors without it being built into the board in the first place. You’d just have a pair of hot, inefficient, expensive processors.

Robert Boluyt

With 2x 230W CPUs in your desktop, it’s more likely you’d just trip your circuit breaker.

Sweetie

People with triple GTX 480 setups and triple GTX 580 setups existed.

Marc Guillot

Great news :-). I would still prefer that ARM were the one ending Intel’s monopoly; after all, ARM has always been open, licensing to anyone that desired to build a chip with their ISA. Anyway, I’m happy that IBM is copying ARM’s business model and allowing a healthier, more competitive market, where everyone can compete on even ground.

RIC

IBM would be wise to do more than just copy the ARM model; they would be wise to also join the Linaro OSS optimization initiative, its networking initiative, and even its new home initiative for end-to-end coverage. Hell, we might even get a new “IBM PC” mark 2 :)

Why didn’t IBM put some legacy x86 microcode in there to ease porting old code and allow emulation/simulation at full speed for legacy-XP-using corporations, etc.?

I dunno, I don’t really care. I just want to see someone pressure Intel. Part of me wishes it was AMD, but that’s not going to happen. IBM is just coming out of left field slinging like mad. So this looks promising. I never thought ARM would do much in the higher end world because their processing just isn’t all that, except in the area of power. Power efficiency is fine for a big server bank or for a mobile phone, but that’s about it.

Tom

What I ultimately want is much faster processors and the magnitude of performance increases we used to get through the generations several years ago. How it happens, I don’t mind so much, but it certainly involves poking Intel with a stick; whether it’s AMD, ARM, or IBM, I’ll be happy.

Kind of wish it was AMD as well though – always rooting for them.

http://www.flickr.com/photos/catchphotography/ H23

I think the lack of performance gains has more to do with the limits of node shrinks and silicon than with AMD’s lack of a bleeding-edge CPU. ARM servers at the low end and Power8 at the high end are definitely going to put the heat on Intel.

http://www.korioi.net/ Korios

The ARMv8 ISA looks very promising, and being 64-bit, Cortex-A57 chips could even compete in the (low-end) server market. I hope the deployment of A53 and A57 cores is not delayed.

http://www.flickr.com/photos/catchphotography/ H23

Looks like Q4 from AMD; not sure about the other guys.

normcf

What I hope to see is a three (or more) horse race to provide server chips. This will drive innovation both for more processing power and for less electrical power. Once Google, Facebook, etc. can run their servers on any architecture, they will be able to drive suppliers’ prices down to compete.

Phobos

That is great news, hope they can put the pressure on.

pelov lov

Great article.

I’m skeptical this is going to pan out the way IBM and its partners would like, though. IBM has a long history of being very closed about both its designs (mainly Power) and its software ecosystem and support. Although the chips they sell have historically been incredibly expensive, the money they make on software is astronomical. The reason IBM is backed into a corner in the first place is its proprietary approach to its business model. If you went Power, you were going to pay obscene amounts of money, and you were stuck with them for the long haul.

This isn’t necessarily true for x86, as software development and the ecosystem have barely anything to do with Intel, at least as far as Intel dictating that end and profiting from it immensely. The backlash from server vendors and tech powerhouses has more to do with Intel’s pricing structure for its chips and platforms, and how they’ve become increasingly expensive. For example, Intel Avotons are essentially just Bay Trail Atom cores, but they’re priced so close to the Xeon E3 models (even in bulk) that it’s a slap in the face. Neither the die sizes nor the complexity of the platform (and the total lack of a chipset) would justify such pricing, but because there are no alternative low-powered x86 cores, Intel can get away with whatever pricing it wants. (Conversely, they’re giving these things away at $5 per Bay Trail in mobile because of the competition in the consumer space.)

Unless IBM is completely open about the entire process — and most importantly that includes the software ecosystem and applications + support at reasonable pricing — I don’t see this going anywhere. IBM will have to offer killer deals in order to increase volume and thus decrease pricing, but what is it that they can offer that ARM can’t?

Joel Hruska

Intel is heavily incentivizing Atom in mobile, but I would not assume a $5 per BT figure. That’s probably low, even for the contra revenue scenario.

pelov lov

They lost a billion dollars in their mobile division just last quarter!

The estimates for contra revenue are pegged at about $51 per device. Intel is also going after the Chinese white-box market, where $5 is the price of the SoC in those devices.

Joel Hruska

They reorganized mobile into different divisions, which is part of why it’s suddenly losing money. And they’ll drive that contra revenue through marketing matching funds or other forms of cost-cutting rather than simply slashing the price on BT itself.

It’s accounting tricks, to be sure, but they aren’t going to sell BT at $5 per chip straight off. Instead they’ll sell you BT for $31 a chip and give you $20 in marketing funds; that way they don’t dilute the value of BT later when they want to raise the price.

Jason Vene

Glad to find someone who posts a treatise like I do. Your point is well thought out.

I’ve followed this over the years.

Around 1986 or so I had a client that was an early adopter of the IBM RT PC, the first RISC product and a precursor to the PowerPC line. It ran at 20MHz, ran AIX, and fit in a tower about three times the size of a full PC tower case, but had the standard ISA card slots you would find in a 386 PC. It was an awkward attempt, but it worked.

It was also $15,000 a unit at the time. They had 4.

By 1990 (or so) another client leased an early PowerPC-based system, also running AIX, at around 60MHz or so, in a case the size of a small office refrigerator. That client continued upgrading, as the lease terms expired, to the present day, in order to run a legacy COBOL-based application for manufacturing scheduling.

I kept watching, through the Apple PowerPC era, for these chips to really take off. I thought when XBox 360 and PS3 adopted the platform the time had just about arrived.

When Apple switched to Intel, there were entire forum threads dedicated to the refusal to relinquish PowerPC G5-based Macs. One poster, complaining about the change of endianness, poetically lamented the torturing of bits in Intel chips. A theme emerged that, somehow, the PowerPC chip was one of the factors lending uniqueness to the Mac platform.

I never even noticed the Power7.

I have to wonder if we’ll ever hear much more about Power8 than magnificent descriptions, a few powerful machines and then…nothing, once again.

Then, in yet another repeat of history, most of the important innovations will probably end up in Intel chips.

wat

Can it run Crysis?

RIC

In this case, sure; many x86 emulators exist, or you can make an x86-64 co-processor board like in the old days…

Kira S

The TDP is 190W according to the datasheets IBM published yesterday, not the rock-boiling 250W the article says.

http://www.mrseb.co.uk/ Sebastian Anthony

The TDP will depend on the operating frequency. I think it will eventually scale up to 4.5GHz+, but for now most of the new Power8 servers are around 3GHz — thus the much lower listed TDP.

chojin999

The PlayStation 4 should have been built with a Cell2 based on Power8… oh well. Actually, a 16-core Cell2 that IBM reportedly manufactured for Sony was a real PS4 prototype, instead of the obsolete, very slow AMD Jaguar APU crap the managers released…
But with IBM managers going to sell all the manufacturing plants (and they tried selling everything just a few months ago, including the whole Power technology), why release the Power8 now with such a big plan?
Did they change their mind, or are the managers so much on drugs that they have no clue what they are doing anymore?

Marc Guillot

What for? On a console, the important bit of hardware is the GPU, and AMD can integrate a GPU with CPU cores way better than IBM.

chojin999

Keep dreaming. You know nothing. You like being ripped off by managers… enjoy, and buy the crap they are selling now.

RIC

Did you miss the part where anyone can now take the Power8 OpenPower option and make a game machine, if they have a workable business plan and backers, with all the initiatives helping them build momentum for the good of all long term?

Phobos

Maybe you were ripped off by the Cell CPU being sold as such a mighty CPU when in reality it was not.

Joel Hruska

That’s not true. Cell really *was* an incredibly powerful architecture. It could handle workloads that brought conventional chips to their knees, and its capabilities are one reason the PS3 kept pace with the Xbox 360 at the end of their lifespans despite less video memory and a less flexible GPU; you could use Cell for graphics workloads, and multiple developers did.

Cell is best understood as an extremely powerful processor whose benefits were ultimately co-opted by programmable GPUs — but it got there first, and it held a unique position in the market.

Phobos

In the end it was just a meh.

Joel Hruska

In what context?

If you mean “The Cell CPU in the PS3 failed to give the console a dramatic leg up over the Xbox 360, despite Sony’s claims,” I agree with you.

If you mean “The Cell CPU in the PS3 failed to have an impact on the general market because it overestimated how many programmers would put up with its unique design in exchange for high performance,” I agree with you.

But if you mean “Cell didn’t do anything interesting and couldn’t perform like IBM and Sony claimed,” then I disagree. It absolutely could. It just got other assumptions wrong.

Phobos

You pretty much answered it for me. It was way overhyped.

Medallish

In many ways, Cell was Sony/IBM’s take on what AMD is now doing with HSA?

The problem for the PS3 was also just very bad documentation, and, I’m guessing, no co-operation from Sony; a prime example of this would be Skyrim: how long did it take for Bethesda to get it to work properly on the PS3? That, and the fact that it was a lopsided processor: you only had one core doing the serial workloads. But objectively it was definitely an amazing processor; obviously, compared to today’s hardware, well, it’s 7+ years old.

chojin999

What makes you think there ever was a lack of co-operation, or of technical documents for developers and software houses, uh?
It’s all about money! Good programmers need time and cost a lot of money. You need more good programmers who want to be paid for their job in order to deliver properly optimized products.
They switched to the AMD Jaguar APU to avoid paying for low-level assembly-optimized code, and at the same time to pay the least for hardware. It’s just as simple as that.
The world is full of bad programmers, lazy programmers, thieves among managers like bankers…
In order to deliver the best products, the best programmers must be paid to do the job properly and quickly. And in order to train more good programmers, more money is needed too.
Software house managers don’t want to spend money; most of them just want big profits with zero risk.

Medallish

Well, obviously, unlike you I do know some developers, and I’ve heard some horror stories of developer PS3s arriving with only Japanese instructions. Money certainly helps, although I don’t think it’s fair to simply say “good developers”; you can be a perfectly good developer, but someone has given you some hardware that is for the most part undocumented, very new, and different.

Listen the PS3 was a failure, I like my own PS3, but for Sony it was probably one of the largest headaches they’ve had, and I don’t know any developer(outside a few Sony in-house ones) that praises the PS3 as a good platform to develop for.

The APU inside the PS4, which isn’t characterized only by its CPU (by that logic the PS3’s Cell was a single-core PowerPC processor), is a better console platform, and yes, indeed, it is also about saving money, as again the PS3 was a massive financial failure. But it’s also a lot more powerful than the Cell processor; heck, it’s even better than the Cell and the nV GPU combined. We’re talking 7+ year old hardware: no matter how you twist and turn it, the PS3 has nothing on the PS4.

Unless you’re actually a developer yourself, I don’t think you’re at all qualified to talk about any developer’s work ethic; it’s my understanding that a lot of them are hard-working people, and suggesting consoles are dictated by “lazy” developers seems like pure speculation on your part. And then there’s the “thievery” comment; I’m surprised you didn’t just say fraud like you usually do, but no, of course there’s none of that going on. Selling a product below cost is risky, but also incredibly stupid; does any successful market do that?

chojin999

1) You have no clue who I am.

2) You can claim to be the Pope on any blog or internet forum. So are you the Pope? Or are you Bill Gates? Or Sony CEO ? All of them at the same time ?

Please…

The AMD Jaguar APU is a silly joke. It’s obsolete hardware. Period.

All your babbling is not going to make it run any faster, nor able to compete with what Intel, IBM, or even Apple can offer.

Phobos

And who are you? Reveal yourself.

chojin999

Said the brave anonymous Phobos….

Phobos

You sure make some bold claims, coming from an anonymous chojin. Yet what validity do they have?

http://www.flickr.com/photos/catchphotography/ H23

How is it obsolete, when Jag and Puma continue to get design wins… all the time? Just because it’s not an i7 fighter does not make it obsolete, my young padawan.

Well, notice I very clearly said I knew developers; it’s no one special, but they have told me about some of the headaches regarding the PS3. At no point did I say I have first-hand knowledge.

lol Nah, the Jaguar is a fine CPU, the PS4 APU in which Jaguar is only a small part, is also fine and definitely not obsolete.

chojin999

Knowing some bad developers means nothing.

If they aren’t used to coding in assembly, and doing it properly, then they aren’t real programmers.

Programming only in Java, .NET, or JavaScript, and finding C/C++ too difficult to use, just means that someone is not a real programmer.

At least not a programmer suited to coding high-quality console or PC games, for sure.

Medallish

Right; as opposed to you, knowing no developers at all means everything? You didn’t even know about the PS3 documentation clusterfuck; that shit is even available to read online. I’m pretty sure coding in assembly has nothing to do with knowing how to program for parallel processors. You’re full of shit, btw. I never said anything about what kind of language the people I know code in, but it doesn’t really matter; knowing the language does not make you a great programmer.

chojin999

Rude little kid being envious much.. uh ?

Medallish

Yeah.. Not sure what I should be envious of?

Bloud Mai

Obsolete?

The definition has obviously changed since I last read a dictionary.
If by obsolete you mean being able to jam 8 cores and an HD 78XX GPU onto a single chunk of silicon.

Yep, pretty obsolete. One can only hypothesize about the monster of a processor that could be crammed into something the die size of an old-school Slot A Athlon or slot P3.

If CPU makers could focus on improving the production process, as opposed to shrinking it, to mitigate the number of errors that would occur when using, for example, a 22nm process on a die the size of a slot CPU, I can only guess the result would be at least a 24-core Jag with an R9 270 on board that, I theorize, would have to ship with a water block attached for cooling.

But in a nutshell, all fan hyperbole aside, all processors have come a long way, and to deem even the lowliest modern CPU obsolete is a misnomer and a disservice to the long hours and research that go into their design.

Q.E.D on my part..

http://www.gaminglaptopsjunky.com/ Junky

That was my thought exactly: a comparison to HSA. From this article it might sound like this is the first direct CPU-GPU interface out there, and while it might be better than HSA (?), we need some good comparison.

UsamaBinLaden123

Actually, Cell was quite bad. How do I know? If it were as good as IBM told us, it would still be sold today. Instead, IBM has killed Cell. You don’t kill a good architecture; you work with it and refine it. IBM did not, because Cell was a dead end.

In fact, in a string-pattern-matching benchmark you need 13 (thirteen) Cells running at 3.5GHz to match one single SPARC T2 Niagara at 1.6GHz. This is when workloads are larger than what fits into the Cell CPU’s cache. With small workloads, the benchmarks showed Cell to be 70% faster than the Niagara T2; with large workloads, Cell dropped 95% of its performance. Cell was a terrible architecture.

Dozerman

Why did you reply to him? WHYYYY???

Marc Guillot

:-) :-)

dc

Maybe they hired the old HP team.

Phobos

Why would you want such a powerful CPU in a console when it hardly gets used to its full potential? Just look at the PS3.

chojin999

The PS3 is still very fast nowadays, and not obsolete at all, thanks to the IBM/Sony/Toshiba Cell project. It was expensive to design, and it was a far more advanced CPU than anything Intel had to offer at the time.

The AMD Jaguar APU in the PS4 and Xbox One is already obsolete, slow, and outdated. It’s a joke, and worse.

If you enjoy being ripped off go and buy the crap they are selling now.

A 10-year-old Cell-based R&D system like the PS3 still has more computing power than the AMD Jaguar APU crap, despite the silly fake numbers AMD, Sony, and Microsoft can claim for gullible people.

The truth is that the AMD Jaguar APU is 50% slower than the cheapest and slowest Intel Core i3 (dual-core) on average. AMD could put even 256 cores in their CPUs and still wouldn’t be able to match the IPC of Intel’s or IBM’s slowest CPUs.

Joel Hruska

Cell is pointless in the era of the modern GPU. I investigated this at great length a few years ago when I wrote about the PS3’s supercomputing capabilities.

Virtually all of Cell’s strengths are shared by programmable GPUs which have since been commoditized and offer vastly superior raw performance as well as better performance per watt.

I will grant you that Cell was, in some ways, ahead of its time. It was an interesting core and it had some unique capabilities, but the difficulty of using it at peak efficiency was simply too high to justify extending the architecture. Even Los Alamos has since retired the Roadrunner Cell-accelerated supercomputer in favor of a smaller, simpler, and cheaper design that doesn’t require every single application be custom-coded to the unique architecture.

The PS3 and Cell *did* find some significant use for HPC applications, but that ship has sailed.

chojin999

You are delusional if you think that GPUs, or maybe the APU nonsense from AMD, can replace high-end SIMD (DSP-derived) vector units.

The whole Intel Larrabee concept is not much different from the IBM Cell one, although they based it mainly on micro x86 cores.

They were afraid of entering the discrete GPU market and scrapped that project, but Larrabee is still alive, remarketed for the HPC and high-end server market and renamed the Xeon Phi co-processor accelerator card, with big plans to become a new-era second-socket co-processor directly on motherboards, like the ancient 8087, 80287, and 80387 were.

And the Xeon Phi is much faster than anything Nvidia or AMD will be able to offer, no matter how much ARM design they add to their GPUs.

Guest

Yeah, I think he’s actually researched it, and if you looked into it just a little you’d learn that programmable shaders in GPUs have become quite capable since Larrabee. Larrabee is just an attempt at making a very capable parallel co-processor; Cell was a heterogeneous main processor, combining serial and parallel workloads (sounds familiar?) to run its OS. Larrabee was never meant to be anything but an add-in board; it has some impressive features, but it’s definitely not much faster than anything AMD or Nvidia has produced.

But this whole thing started with your stupid comparison, where you take AMD’s Jaguar CPU and compare it to the entire Cell processor; that’s kind of unfair, and also moronic, since a fucking i7 can’t “compete” with the Cell on raw performance… And when you compare Cell to the APU in the PS4, it’s a MASSIVE upgrade.

chojin999

What nonsense are you babbling about ?

Other than insulting me you can’t do anything else, can you, uh?

The current Xeon Phi is much faster than anything AMD and Nvidia have on the market.

And what’s that silly statement that the AMD Jaguar APU shouldn’t be compared with the IBM/Sony/Toshiba Cell in the PS3, uh?

Unfair WHAT ?

Current Intel Core i7 6-core+ CPUs are nowadays much faster than the Cell on raw power.

It’s AMD that sells obsolete hardware; their CPUs clearly suck.
Also, you contradict yourself… you end your silly statement by saying that the same AMD Jaguar APU you claim it’s unfair to compare to the Cell would be a massive upgrade over it??

Really?

Did you think a bit before writing that nonsense of yours or what ? You were too busy adding insults to your silly statements ?

The AMD Jaguar APU is a farce, obsolete hardware.

carol argo

Cell processor? Aren’t those on a 22nm process in Sony’s servers for PlayStation? Also, isn’t IBM selling a Cell processor for their own (whatever)? It would be fun to compare IBM’s top-of-the-line Cell processor with the top of the line of anything! Sadly there is no software to make the comparison.

Joel Hruska

Sony has developed a PCB board with multiple PS3 SoCs on it for use with PlayStation Now, yes.

http://www.flickr.com/photos/catchphotography/ H23

It’s a low-power SoC designed against Bay Trail’s lower-power parts, not Core parts. Within its TDP it’s very good; Puma, even better.

chojin999

A console is designed to play games at maximum quality for as many years as possible; at least, that’s how they were designed in the past.
But thanks to the AMD Jaguar APU farce, that is no more. At least they could have used an AMD Richland APU, which is much faster (still a lot slower than Intel CPUs, but better than the Jaguar crap).
But no… they wanted to make a fool of customers, selling an ultra-cheap product at inflated prices.

http://www.flickr.com/photos/catchphotography/ H23

We’ll get many years out of this generation of hardware; if you want bleeding-edge graphics, get a PC. You seem angry. Go home, chojin999, you’re drunk.

Phobos

The PS4 at $399 doesn’t seem inflated at all; if anything, it’s around $100 cheaper than the PS3 was at launch, and its GPU is 4.5x more powerful. Now, the X1 is inflated.

Joel Hruska

What is this, 2008?

I’m not talking about what might happen. I’m talking about what *did* happen. In 2008, several whitepapers were written for the HPC space talking about how Cell had multiple advantages as a computational accelerator and would gain ground in the TOP500 list. Today, not one TOP500 system uses Cell. Don’t believe me? Look at the co-processor section for yourself:

IBM predicted that a 32-core Cell on 45nm would be capable of approximately 1TFLOP of single-precision performance. That would have been an impressive set of gains, but the company chose not to bring it to market because GPUs were already stealing that segment for themselves.

Today, a modern AMD or Nvidia GPU (or Xeon Phi) is capable of 5TFLOPS of single precision and 1.4-1.7TFLOPS of double-precision. While these are idealized numbers, they remain faster than the fastest Cell IBM ever shipped, with its ideal 102GFLOPS of double-precision floating point.

I promise you that if the HPC market, which emphasizes performance over cost, saw value in the PowerXCell 8i, they’d still be using it.

Phobos

You fail to answer my question and instead once again bash AMD.

http://www.flickr.com/photos/catchphotography/ H23

It’s actually a very power-efficient design that is saving us money. Jaguar and its successor, Puma, are powering more devices than you might imagine with just four cores. The CPUs in the new consoles are a good fit for a console; stop being ridiculous.

chojin999

Either you work for AMD, maybe in their marketing department, or you are a kid who enjoys being ripped off. Really. If you like it so much, keep buying the obsolete crap hardware they are selling; AMD managers will be pleased.

http://www.flickr.com/photos/catchphotography/ H23

Well, that comment is coming from someone who is obviously very upset about something; if you would use some logic and not be biased, you could read some interesting stuff. I actually don’t game; I like understanding hardware and reading reviews. If you spent a little time reading, you would see that the new AM1 platform is getting great reviews, and it uses 2- and 4-core Jaguars. Lots of laptops have been getting quad-core Jags in APU form that are very competitive in the lower price category. The Jag and new Puma cores are meant to maintain power efficiency; it’s not old tech. Having eight lower-power cores in the consoles allows them to run cooler and use less power, which makes them quieter as well. It was a good choice, with “close to the metal” APIs that allow the GPU to handle even more of the workload. Jaguar/Puma is not a Core competitor; it goes toe to toe with low-powered Intel solutions very well. The tech community has widely lauded AMD’s cat cores as a major win for the company, and they will continue to develop them. Sorry you’re upset.

Oh, and by the way: Hawaii can deliver 5-plus TFLOPS of compute, Phi 2 TFLOPS… And the IBM/Nvidia combo has shown that GPUs are more viable in data-center and supercomputer applications. So sorry again.

chojin999

The Xeon Phi was released just a couple of years ago; the AMD and Nvidia GPU crap has been selling for many years.
It takes time for new products to be tested, installed, and have code optimized for them in supercomputing and enterprise/server environments.
The AMD and Nvidia GPU architectures are so limited that they just can’t be faster at all. Those are just fake DSPs. Yes, they have added a lot of DSP features through the years, but they’re still far from any true DSP architecture.
The Larrabee/Xeon Phi is far more advanced and better designed than any AMD or Nvidia GPU to date.
If Intel markets it properly and optimized code is written for it, there will be no chance for AMD and Nvidia to keep selling their expensive crap.

http://www.flickr.com/photos/catchphotography/ H23

Plenty of rumors say they have shut down Xeon Phi altogether; the Barcelona R&D facility, which was dedicated to Phi, has been closed. I think it's pretty clear that GPU compute is the more economical, capable product right now for the data-center and supercomputing worlds. And Phi is far more expensive than an Nvidia Tesla GPU.

That's the root of this whole thing: people want out from underneath Intel's price dictation, and they need competitors for that.

Joel Hruska

Intel has big plans for Xeon Phi. That architecture is going nowhere. Knights Landing is expected in 2015 (silicon is already in testing).

http://www.flickr.com/photos/catchphotography/ H23

You know how good it is to have x86 in the gaming consoles; you're silly. 8 Jaguar cores are adequate and fairly power efficient.

I'm disturbed by how irrelevant and senseless your last post was. Not sure what you are trying to say; everything you just pointed out was irrelevant. What do Piledriver, Steamroller, Ivy Bridge, and Sandy Bridge processors have to do with anything? The "equipment cores" are different than the "cat cores"; Jaguar is a cat core.

If you think that your cat could become a lion… you are absolutely delusional.

The AMD APU Jaguar is clocked at just 1.60GHz. Having 8 cores means nothing. It's still a very slow CPU designed for netbooks and cheap notebooks.

http://www.flickr.com/photos/catchphotography/ H23

It's clocked at various speeds depending on the required TDP, and comes in 2- and 4-core models for netbooks, tablets, and AM1 desktops. It's an 8-core in the consoles and is adequate for the amount of processing power they need, while not using much power and thus not generating much heat. Developers are now writing multi-threaded games to use that processing power adequately, while using "close to the metal" APIs to get the GPU to compute as much as possible. Sorry, kid.

chojin999

Keep dreaming.

It's clocked so low because it's a cheap product. AMD needs a 5GHz clock to match some Intel CPUs, as the benchmarks prove.

Both the PlayStation 4 and Xbox One, thanks to how "powerful" the AMD APU Jaguar really is, struggle to get a sustained 30fps even at 720p. 1080p at 30fps or 60fps can be reached only by disabling most effects algorithms, including the physics ones, or by lowering the detail/complexity to the bare minimum settings.

Most games just feature flat textures and low-poly-count 3D models to speed things up as much as possible, and despite all that, the AMD APU Jaguar on both consoles still can't guarantee a sustained, stable 30fps framerate. Most games run at 15 to 22 or 26 fps; the framerate goes up and down continuously.

Optimizations will be pretty limited and won’t improve things much.

These are fake next-gen consoles.

True next-gen consoles should have been designed for no less than a sustained 4K 120fps framerate at maximum detail settings.
And oh well, actually Sony did that.. because the real PlayStation 4 prototype, with a 16-core Cell2 CPU, was designed and built inside Sony; IBM manufactured the 16-core Cell2 CPU for Sony.
But then the managers decided to go with the fake next-gen, ultra-cheap, obsolete AMD APU Jaguar crap instead…

http://www.flickr.com/photos/catchphotography/ H23

You’re comparing different products from different generations in terms that make no sense, you clearly have no clue.

chojin999

It makes perfect sense. The newest AMD APUs and CPUs can't match 3-year-old Intel CPUs performance-wise. AMD needs to raise the clock to 5GHz to get some nice benchmark results, but it's still half the speed of what Intel has to offer.
You are the one who has no clue here.

http://www.flickr.com/photos/catchphotography/ H23

You keep talking about Steamroller vs. Haswell vs. Broadwell, different generations competing at what? The Jaguar cores, man, today, are competitive with the EQUIVALENT Intel offering, which is Silvermont. So that is Silvermont vs. Jaguar, low-TDP chips, okay? Not Jaguar vs. a high-end Haswell i7 or a 4.2GHz Piledriver.. lol

Is English your second language? Are you tired or something? Do you have a search engine? Why can't you get it? Pull up Silvermont vs. Jaguar; it's close.

Phobos

No less than 4K 120fps? I don't know if you are on drugs or just trolling.

Medallish

xD that’s brilliant though, the PS4 should have been the size of a small house!

chojin999

More nonsense. You have no clue what you are talking about.

Medallish

You could have made a console out of nothing but 780 Tis and i7s, and consistent 4K @ 120Hz/fps would still be impossible. What you're suggesting isn't even possible on PCs, so how in the holy hell do you imagine consoles would do it? No one can take you seriously after that insane comment xD.

chojin999

I don’t use drugs. Maybe you do.

Medallish

lol, you're so full of shit. CPU performance in a console doesn't matter; if it did, the PS3 would have to be considered a failure to you, as it had horrible single-core performance.

chojin999

Other than insulting me and writing nonsense, you really can't do much more, huh?
And what is this trick you are using here, trying to claim that I would be against the Cell?
As if I didn't write anything about Cell before.. and surely nothing negative.
That is a silly, childish trick, because you are trying to make me say the opposite of what I wrote. Fact is, you can't succeed at that.

http://www.flickr.com/photos/catchphotography/ H23

I think it’s time to shut it down man, you look ridiculous

http://www.flickr.com/photos/catchphotography/ H23

I'm sorry you can't get it; I feel bad for you. Going x86 in the consoles gives developers the ability to design games much more efficiently across all the platforms (PC, Xbone, PS4). Developers will address, and are addressing, a true multi-threaded game-design approach, using the equivalent of an AMD 7850 for graphics. This is plenty of grunt, especially with a streamlined OS and an API that utilizes resources effectively. Jaguar is low-powered by design, not obsolete.

carol argo

OK! Is there an affordable version (like the upcoming 8-core from Intel, the 5930 if I recall)? If IBM were to launch a counter to the 5930, Intel might wish IBM had stayed asleep.

Joel Hruska

IBM hasn’t sold chips to mainstream consumers since Apple moved away from the G5. So no, there will be no “affordable” versions.

chojin999

Apple moved to x86 starting from 2005 to 2007.

The Microsoft Xbox 360 uses an IBM 3-core PowerPC CPU, released in 2005.

The IBM/Sony/Toshiba Cell in the PlayStation 3 was released in 2007.
The Nintendo Wii uses an IBM PowerPC CPU and was released in 2006.

The Nintendo Wii U uses an IBM 3-core PowerPC CPU and was released in 2012.

Dozerman

You confuse POWER and PowerPC.

chojin999

You are the confused one here. I replied to Joel Hruska's statement that IBM didn't manufacture CPUs for the consumer market after Apple switched to Intel.
Apple never used IBM Power CPUs either, just PowerPC.
Anyway, many IBM custom PowerPC designs for OEMs have included some Power technologies.
And the IBM/Sony/Toshiba Cell CPU used a lot of technologies derived from the Power architecture.
So yes, no consumer-grade full Power-line CPU was ever released by IBM, but pieces of it have been included in some PowerPC versions through the years.

sebastian giacana

It wasn't PowerPC (Motorola); it was a Power4 processor with just one core.

Joel Hruska

None of which qualify as a CPU you can buy in a consumer PC. The question was “Is there a chip like an Intel CPU?” Meaning “A CPU you can buy in a computer for general purpose computation?”

The answer is, no. I never said IBM stopped manufacturing processors, I said IBM stopped manufacturing mainstream consumer chips.

chojin999

??? Seriously?
All of those were/are IBM PowerPC CPUs customized for OEMs and sold in the consumer market.
They really are just like the Intel ones, or the ARM-based ones found in smartphones and tablets.
Just because IBM is not directly selling its Power or PowerPC CPUs to end users in the consumer market doesn't mean the ones in consoles can be regarded as enterprise/server-class CPUs, or automotive/industrial ones either.
Consoles are consumer devices. All those PowerPC CPUs are for the consumer market.

Joel Hruska

You cannot buy any of those CPUs in a PC, laptop, notebook, or tablet. You can target them for development only if you are working in highly specialized ecosystems. Sony killed the Other OS community around the PS3, neutralizing that avenue, and MS never supported one.

You can buy a PowerPC processor in a specialized box with no general-purpose access to the underlying computer hardware. If you want to develop for a console, obviously that’s not a problem. But there’s no way to walk into a store (virtually or otherwise) and walk out with a consumer PC running a PowerPC architecture.

chojin999

You keep repeating the same thing over and over.

PCs, notebooks, and tablets are not the only consumer devices on the market.

Consoles are consumer devices just like PCs, notebooks, tablets, and smartphones.
You can't keep claiming that there are no IBM PowerPC CPUs in consumer-grade devices. It's just not true. Consoles using PowerPC CPUs are consumer devices.

http://www.flickr.com/photos/catchphotography/ H23

It’s clear that x86 and ARM will dominate consumer devices for the programming advantages.

Sweetie

The reason Apple ditched IBM is that the Cell processor came about as a result of Apple's work with the company. It wasn't going to use the G5 in the first place. IBM decided, during the design process, to make the architecture in-order and unsuitable for a regular computer, because of Sony's PS3. That screwed Apple, which was forced to use the power-hungry and expensive G5, keeping the obsolete G4 as its mobile chip.

RIC

What did they do to improve AltiVec SIMD? Is that also going to be 512-bit, like Intel's SIMD for servers?

Nintendo saw the writing on the wall: x86 in its next console (according to leaks on SemiAccurate).

Dozerman

Anybody else here wondering what a single-core variant would look and act like? They could probably ramp clocks up over 5GHz without the other cores producing heat, and that, combined with the massive uncore throughput this chip has, would break single-threaded records for years to come.

Joel Hruska

It wouldn't be as great as you think, for two principal reasons:

1). POWER8, like its predecessors, is still explicitly designed for multi-threaded workloads. That’s why it has so many execution units per core — the SMT design gives it more flexibility to keep each core fed. If you hold it to one core, you’re holding it to two threads — but IBM’s design strategy is to use threading to hide longer cache access latencies. You can’t do that if you don’t have lots of threads to play with.

2). Clock speed scaling simply isn’t high enough to overcome the loss of threads. One core would clock higher than 12 cores, but it wouldn’t clock even 2x higher. Moving all the cache structures to a fast L2 wouldn’t matter past a certain point — it wouldn’t be useful to have a 32MB L2 cache feeding just one core.

I suspect that the sweet spot for moderately parallel code would be a 4 core design clocked moderately higher than the intended 12-core, with less overall cache but faster caches — but then, since so much of the workload in this space is intrinsically parallel, you just wouldn’t see the big boosts you’d hope for.

You *could* probably push it beyond 5GHz, so there is that. ;)
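To make point 2 concrete, here is a toy throughput model; every number in it (per-core SMT yield, clocks) is my own illustrative assumption, not an IBM spec:

```python
# Toy model: aggregate throughput ~ cores x SMT yield x clock.
# SMT yield and clock figures are illustrative assumptions, not IBM specs.
def throughput(cores, smt_yield, clock_ghz):
    """Relative aggregate throughput in arbitrary units."""
    return cores * smt_yield * clock_ghz

SMT8_YIELD = 2.2  # assumed: 8 threads on one core ~2.2x one thread

full_chip = throughput(cores=12, smt_yield=SMT8_YIELD, clock_ghz=4.0)
one_core = throughput(cores=1, smt_yield=SMT8_YIELD, clock_ghz=5.5)

# Clock a lone core would need to match the full chip's throughput:
needed_ghz = full_chip / one_core * 5.5
print(f"12-core chip: {full_chip:.1f}, single core: {one_core:.1f}")
print(f"single core would need ~{needed_ghz:.0f}GHz to match")
```

Even with a generous 5.5GHz single-core clock, matching the 12-core part on parallel work would take a clock no silicon can reach, which is the point: the design trades clock for threads.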

Dozerman

Limited to two threads? Was that just a typo, or are all those other threads somehow created through the interaction between cores, or something to that effect?

Joel Hruska

I'm sorry, typo on my part. One core / eight threads. But you'd still need faster caches to compensate (historically IBM's L3 is very slow, and its L2, I believe, is still slower than Intel's). You can hide that latency with tons of threading.

Sweetie

The other problem is that Intel is already at a smaller process node. Lagging behind at 22nm is going to hurt such a big chip.

sebastian giacana

The kind of beast you will need to render 8K video. To beat Xeons, IBM needs a lower TDP and better FP.

thor42

This is *great* to see!

If ARM can hammer Intel in the mobile and embedded space, IBM can do likewise in the server space.

I would *love* to see some real-world demos of these chips.

James Tolson

This is a PowerPC processor, right? If AmigaOS4 supported multiple cores, then an Amiga based on one of these would be awesome :-)

davidcianorris

Go, go, IBM: take it to those bastards, and convince AMD to go with the real Big Blue for the high-performance market. I don't mention Nvidia because they already have an alliance.

UsamaBinLaden123

There are inconsistencies in the article. IBM claims this POWER8 is up to 2-3x as fast as a POWER7+ socket, and 2-3x faster than an Intel Xeon or SPARC T5, and this article claims POWER8 is the most powerful CPU in the world. It seems IBM claims POWER8 is 2-3x faster than any other CPU.

And likewise, the SPARC T5 is 2.4x faster than the POWER7+ in real-life TPC-H benchmarks, not just in theory. There are lots of other world records; for instance, here is a benchmark where a four-socket T5 server is more than twice as fast as an eight-socket POWER7, which means one T5 is more than four times faster than a POWER7. Sure, the POWER7+ is slightly faster than the POWER7, so you can interpolate. https://blogs.oracle.com/BestPerf/entry/21030612_sparc_t5_4_tpc

The point is that POWER7 is an old CPU; IBM does not update its POWER CPUs as often as x86 or SPARC (which doubles performance every other year). So when IBM claims that POWER8 is 2-3x faster than an old POWER7, that is nothing to brag about. And when IBM also claims that POWER8 is 2-3x faster than Intel Xeon or SPARC T5, that is simply not true, because as we have seen above, benchmarks show Intel and SPARC are 2-3x faster than POWER7. So if POWER8 is 2-3x faster than POWER7, they are all on par with each other. Hence, POWER8 has caught up with the competition and is now (hopefully) as fast as Intel Xeon and SPARC T5.

This year the SPARC T6 will arrive, which again doubles performance. And Intel will roll out the Xeon E7 v2 in several models. So POWER8 will again lag in performance very soon, if it does not already.

Regarding POWER8 being 1000x faster than Intel Xeon: well, I hope no one believes that. IBM claims that one big Mainframe can replace 1,500 x86 servers. Well, the largest Mainframes have 24 sockets, and each socket is much, much slower than a high-end Intel x86 CPU (see links below), so how can 24 slow Mainframe CPUs replace 1,500 Intel x86 servers? Well, it turned out that all the x86 servers were old (Pentium 3 at 800MHz with 128MB RAM, etc.) and all of them were idling, while the Mainframe was 100% loaded. So the Mainframe can replace 1,500 idling, antique x86 servers. But what happens if a few x86 servers do some work? The Mainframe will choke. So there is no way that a Mainframe can replace 1,500 x86 servers as IBM claims, and I really, really doubt that one POWER8 is "1000x faster" than x86 CPUs. Well, maybe the POWER8 is 1000x faster than an 8MHz 8086 CPU.

It runs at 5.2GHz and has ~130MB of CPU cache (L1+L2+L3+L4). We are not talking about 24MB of CPU cache here, no. And this is "the world's fastest CPU." Well, let us analyze IBM's claims about the uber-fast Mainframes. The largest IBM Mainframe, which is extremely expensive, has 70,000 MIPS of performance today. That is the target to beat. It has 24 sockets and up to 3TB of RAM (Intel Xeon E7 servers have up to 12TB of RAM, and the SPARC M6 has up to 32TB).

First of all, you can EMULATE an IBM Mainframe using TurboHercules on a laptop. If you use an old 8-socket Nehalem x86 server, you get an emulated 3,200 MIPS, which corresponds to a midsized Mainframe. But software emulation is 5-10x slower than running native code, so the 8-socket server would give you 16,000-32,000 MIPS if you ported the software instead of emulating it. So you would need two or three 8-socket x86 servers to match the 70,000 MIPS Mainframe. In total, you would need 16-24 old Nehalem CPUs to match 24 of the "fastest CPUs in the world." The latest Xeon E7 v2s are more than 2x faster than Nehalem, so you would need 8-12 E7 CPUs to match the largest 24-socket IBM Mainframe. So it seems that IBM's world's-fastest CPU is not that fast, eh? http://en.wikipedia.org/wiki/Hercules_%28emulator%29#Performance
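Spelling out that estimate chain in a few lines of Python (every figure below is the comment's own estimate, not a measured benchmark; carried through exactly, the range comes out closer to two to four servers than "two or three"):

```python
# The comment's estimate chain, made explicit. All inputs are the
# comment's own figures, not measured data.
emulated_mips = 3200                  # Hercules on an 8-socket Nehalem server
overhead_low, overhead_high = 5, 10   # claimed emulation slowdown vs. native
mainframe_mips = 70000                # largest 24-socket IBM mainframe

native_low = emulated_mips * overhead_low     # 16,000 MIPS if ported
native_high = emulated_mips * overhead_high   # 32,000 MIPS if ported

servers_best = mainframe_mips / native_high   # best case: ~2.2 servers
servers_worst = mainframe_mips / native_low   # worst case: ~4.4 servers
print(f"native estimate: {native_low:,}-{native_high:,} MIPS")
print(f"8-socket Nehalem boxes needed: {servers_best:.1f}-{servers_worst:.1f}")
```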

Here is another source showing that IBM Mainframes have very slow CPUs (I am not talking about Mainframe I/O, I am talking about the Mainframe CPU). One guy ported Linux to z/OS and could compile the same software on Mainframes and compare speeds using the same software. He concluded that 1 MIPS equals 4MHz of x86. So 75,000 MIPS corresponds to 300,000MHz = 300GHz. The latest E7 v2 CPU has 15 cores running at 2.5GHz, which equals 37.5GHz. And again, you only need 10-ish E7 CPUs to match the largest IBM Mainframe. http://www.mail-archive.com/linux-390@vm.marist.edu/msg18587.html
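The 1 MIPS ≈ 4MHz conversion, carried through exactly (the ratio is the linked mailing-list post's estimate, not a standard; the exact result is 8 chips, which the comment rounds up to "10-ish"):

```python
# The comment's MIPS-to-x86-MHz conversion, carried through exactly.
# The 4 MHz-per-MIPS ratio is the linked post's estimate, not a standard.
MHZ_PER_MIPS = 4
mainframe_mips = 75000
equivalent_ghz = mainframe_mips * MHZ_PER_MIPS / 1000  # 300 "GHz" of x86

cores, clock_ghz = 15, 2.5         # Xeon E7 v2: 15 cores at 2.5GHz
per_chip_ghz = cores * clock_ghz   # 37.5 "GHz" per chip
print(f"E7 v2 chips needed: {equivalent_ghz / per_chip_ghz:.0f}")
```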

And there is a study done by a consulting company that shows the same thing. They compare a single-core 900MHz Xeon, and it is faster than an old Mainframe CPU, the z9. We know how many percent faster the z10 is than the z9, and the z12 in turn, so we can extrapolate; and we know how much faster the latest E7 is than that old 900MHz Xeon, and again we land at the same conclusion.

So, no, the Mainframe CPUs are not that fast, even though IBM claims they are the world's fastest; and they cannot replace 1,500 x86 servers either. So when IBM claims the POWER8 is 1000x faster than the Intel Xeon E7 v2 CPU (which is very good), you should not believe IBM.

The big advantage of POWER8 is that it (probably) scales to 32 sockets, whereas the cheap Intel E7 scales up to only 8 sockets. So you cannot scale up that much with Xeon. To do that, you need to go to Unix servers such as big POWER8 or big SPARC servers.

olagh beshoor

Well done analysis.

Joel Hruska

You make a lot of dubious assumptions about software and CPU performance scaling on limited proof. Granted, that’s not your fault — apples-to-apples benchmarks for Xeon and POWER7 configurations are hard to come by and often available only in a few tests.

That doesn’t mean you’re automatically wrong, but it’s a stronger conclusion than I would draw without far more data points.

Brett Murphy

Where to begin? You attempt to write with some authority by citing a couple of websites that have nothing to do with the article, comparing Intel and SPARC to Power7/7+ servers, which was not the focus of the article, and then feign disgust that IBM would have the gall to develop its own product, let alone one that competes with your personal favorite.

Let's walk through each item that is worth mentioning.
1) Your login/handle is inappropriate. You may think it is provocative or funny to use the name of a terrorist, but he is somebody who killed innocent people who did nothing to him!
2) Power8 for certain workloads is 2X per core over Power7/7+ and on a socket basis is 3X better. That is because it has 12 cores per socket compared to 8 cores with Power7. If you want to respond I will be happy to cite the chapter and verse of the benchmarks if needed. Keep in mind the initial Power8 products released are entry level 1 & 2 socket servers. IBM is comparing these 1 & 2 sockets servers to competitive 1 & 2 socket products. However, by extension because of how Power servers work which I do not think you understand given what you wrote, you actually can do some comparisons of these 1 & 2 socket servers to (larger) competitive products that have the same benchmarks available.
3) Many enterprise products are priced based on core. Thus, the per core performance is what matters, not the number of cores per socket (primarily) and definitely not performance / watt or TDP as the x86 crowd likes to focus on.
4) Power servers because of the Power Hypervisor are efficient by design. You can allocate compute resources to VM’s as needed. When resources are not used (valley in a utilization cycle) it can be allocated to other VM’s who can consume that resource to grow beyond its entitlement. So, a VM will always get it’s entitlement or QoS while often obtaining more. The max core resource for each VM is what is licensed for the software. However, as you add multiple VM’s in a Shared Processor Pool you can actually have 20X more VM’s in a Pool than cores. So, this means you could have 1 core with 20 VM’s and only have to license 1 core of software.
5) I have never lost a TCO sizing against x86 or SPARC using Oracle for example with Power servers even with the manipulation and games Oracle plays with the licensing factor of .25 and .5 favoring their servers while hindering Power with 1.0 – don’t believe me that the licensing factors are not manipulated? I’m sure everybody believes Itanium should be 1.0 and not .5 as it was before Mark Hurd left HP and joined Oracle and then HP sued Oracle…hmmmm. This point also goes to your left field comment about the mainframe. First, both mainframes and Power servers – Unix servers in general if architected probably can consolidate a lot of x86 servers – is it because of throughput, performance or both? With mainframe and Power it is for the reasons stated above – 40+ years of virtualization experience that runs as firmware intertwined or embedded with the hardware compared to the host OS that is installed. Linux on Power, AIX and IBM I have hooks for system calls and other communications between the hardware and OS that are done through the PHYP or the hypervisor so there is isolation, scale, security and reliability because of the predictability and awareness they have for one another. Until you understand the “efficiency” element you will always think it takes a 16-core Power server to compare to a 16 core competitive server.
6) It's unfortunate that you are upset at IBM because the article said Power8 is the most powerful CPU in the world. IBM didn't say it. Are you upset that Oracle called the T5 the fastest processor in the world? I am no EMC fan, but Kevin Closson has an excellent blog post on this so-called "fastest microprocessor" at http://kevinclosson.wordpress.com/2013/04/09/my-first-words-on-oracles-sparc-t5-processor-the-worlds-fastest-microprocessor/
7) You then go on to say that in fact Intel’s IB E7 v2 processors are actually 2X faster than Power7+ even though this article is about Power8. You cite an article from http://www.anandtech.com from Feb ’14. Ok, let’s look at the data. You accurately let the readers go to the site and see a quad core IB E7 v2 post a SAP SD 2 Tier result of roughly 135K SAPS compared to a quad core Power7 result of 93K and a quad core p270 at 68K. What you fail to state is the IB is 4 sockets with 15 cores each totaling 60 cores compared to the p460 with 4 sockets at 8 cores totaling 32 cores and the p270 with 4 sockets at 6 cores totaling 24 cores. I’m sure you meant to disclose this right? Many vendors like to make claims and let the listener be responsible to verify if what they said is actually true or that they know what they are talking about.
8) Let’s look at these results a bit further though since you want to compare Intel’s latest and greatest chip – I mean it has to be awesome because you said it is 2X faster than Power7+…wait – but there was a Power7 result with 98K – oh, but that was Power7 and not Power7+. Do you by chance have any connection to a vendor or partner of a vendor? I ask because you are working hard to promote the solution that drives up TCO to include software-licensing cost like you have a vested interest. You may not and that is ok as I am just asking if you would disclose it so everybody can consider this as they read your comments. So, that 60-core result of 135K SAPS is actually 2250 SAPS per core. Pretty impressive except the Sandy Bridge processor generated 2368 SAPS per core…but that E7 v2 processor has 15 cores per socket by golly and outperforms the previous generation E7 just because of sheer number of cores – congratulations on that accomplishment – customer loses – software companies win – genius! The 32 core Power7, which was released in March 2010, did pretty good compared to the 60 core IB E7 v2 released almost 4 years later. The 32 co P7 750 Generated ~69% of the 2014 60 co IB E7 v2 number with 53% of the 2014 IB cores…again, using 2010 technology – or an “old cpu” according to you.
9) Let's look at an SAP S&D 2-Tier SPARC result. I'm looking now to see what has been posted. I'm sure there are several 16-, 24-, 32- or 64-core SAP results so we can compare to the plethora of x86 entries and the fair number of Power entries. I almost fell for your sleight of hand, by the way. I read what and how the blog was written for the T5-4 World Records results – quite impressive (creative writing, that is)! The Oracle author says the 4-processor T5 is 2.1 times faster (in June of 2013) than the 8-processor Power7 780 – incredible. I am standing in my living room applauding. But wait: as I scroll down and look at the table, I see this is the first-generation Power7 from Feb of 2010 with only 32 cores, compared to the 64-core T5-4…. but the T5-4 was 2.1 times faster…… with 2 times the cores, twice the memory, and a memory-to-database ratio of 66% for the T5 with 2TB RAM vs. 34% for the P7 using 1TB RAM. Hold, please, I am doing some math here on my calculator because something doesn't add up: 4 T5 sockets = 64 cores while 8 P7 sockets = 32 cores….. I'll let the readers determine whether you meant to mislead them along with the blog results posted by the Oracle master marketing machine, or whether in your exuberance you typed faster than you thought about it.
10) Back to that 2 Tier SAP result – Wow, do you know there are just 4 Oracle entries? 1 for an x86 server, one for a 384 core M6-32, 1 x 192 core M5-32 and 1 x 128 core T5-8 server. That’s weird – nothing for anything smaller. It’s almost like Oracle didn’t want to show how the T processor scaled from 32 to 64 to 128 but instead posted a result for the largest server configuration for each model. Now the poor seller or customer has to extrapolate that all smaller configurations would be divisible by that number of cores and sockets. – thus linear. I’m sure it is (linear), would you agree? Let’s look at them.
The M6-32 delivers 384 cores = 793,930 SAPS and 140,000 OR 2067 SAPS per core and 364 users per core. The M5-32 delivers 472,600 SAPS and 85,050 users OR 2461 SAPS per core or 442 users per core. The T5-8 delivers 220,950 SAPS and 40,000 users OR 1726 SAPS per core and 312 users per core. Interesting results. I would have expected the M6 to deliver the greatest result followed by the M5 but wow – the T5 was down 30% from the M5 per core result. Looking for other SAP benchmark results, I see a Sun result from 2009 with the T5440 with 4 sockets and 32 cores that is a Oct 2008 era server or 1.5 years before the 4 socket Power7 result you used. It delivered just 25,830 SAPS and 4720 users. I can see why Sun was so leery in publishing many benchmarks – yuck. I do not see any SAP S&D 2 Tier T3 or T4 results at all. There are T1/T2 results from Sun and the large Oracle T5 results. I could check other benchmarks and I’ll give them the benefit of the doubt that there are more for SPARC. Without question x86 publishes many benchmarks – of course, most of them do not include virtualization like VMware (unless it is a virtualization specific or focused benchmark) so that should always be taken into consideration.
11) Just to be consistent, here is the list of Power results. a) 96 core 780 with 311,720 SAPS and 57,024 users OR 3247 SAPS per core and 594 users per core b) 16 core 7R2 (Linux only) with 45,150 SAPS and 8,256 users OR 2821 SAPS per core and 516 users per core c) 48 core 760 Power7+ with 139,220 SAPS and 25,488 users OR 2900 SAPS per core and 531 users per core d) 16 core p260 (Power7+ Flex node) with 54,700 SAPS or 10,000 users OR 3428 SAPS per core and 625 users per core e) 128 core 795 with 384,330 SAPS and 70,032 users OR 3002 SAPS per core and 547 users per core f) 256 core 795 with 688,630 SAPS and 126,063 users OR 2689 SAPS per core 492 users per core g) 64 core 780 with 202,180 and 37,000 users OR 3159 SAPS per core and 578 users per core h) 12 core 730 with 38,520 SAPS and 7000 users OR 3210 SAPS per core and 583 users per core i) AND of course the latest the 24 core Power8 S824 server with 115,870 SAPS and 21,212 users OR 4827 SAPS per core and 884 users per core. I skipped several, as the list would be more extensive than it already is as it just repeats itself. I’ve made the point that IBM has posted not only many results but for various configurations to give the reader different data points using different configurations of software and hardware. The performance results for this “old cpu” as you call it are pretty impressive considering they beat every x86 and SPARC result I could find when you calculate by the core which IS what matters. When you do not have an efficient server then you stuff it with as many engines as you can to get scale and either the unsuspecting customer or the customer who only buys John Deere green, they don’t care as they wouldn’t buy a Case International “Red” solution if it was the best one on earth – tried to tie in a metaphor for those who can appreciate the JD green or Ford vs Chevy loyalty examples.
12) Seriously, you don't think a 24-core result of 115,870 SAPS is something to brag about? It is within 15% of the highest 60-core x86 result and even better than the NEC E7 v2 result. You keep telling yourself that Power8 is just catching up and E7 v2 is kicking its butt… and of course, Haswell and Broadwell – OMG – oh the terror! What do you think will happen when IBM delivers Power8 in 4-socket and larger configurations? If a 2-socket with 24 cores matches the latest and greatest 4-socket 60-core, could it be that a 4-socket 48-core Power8 might generate an SAP result around 230K? Maybe an 8-socket with 96 cores close to 500K? How about a 16-socket with 192 cores – possibly 1M SAPS? I don't know, but I just showed you a range of Power7, Power7+, and entry-level Power8 SAP results. Unless Haswell and Broadwell do something revolutionary instead of the typical evolutionary progress that has been the case since Westmere, I do not see them keeping pace with Power – one might argue the introduction of Nehalem was the last "revolutionary" release of x86 technology.
13) Why should IBM refresh their chips as often as Intel? What Intel is doing is driving sales and a refresh cycle. With few exceptions such as the introduction of QPI with Nehalem the benefits of each Tick and Tock are not that impressive if you remove the marketing and hype. Yes, they usually turn the crank and advance the ball a peg or square but as I’ve shown they also go sideways. With enterprise customers they do not want to deploy a solution just to turn around and touch it again in 2.5 years. They would like to see it run for 5 – 7 years. I have many Power customers who ONLY upgrade because of the financial benefits of reducing maintenance, software and their solution costs as price per performance improves. They do not need to do it because of performance or reliability. It isn’t uncommon to see Power customers remain on the same server for 10 or more years, especially IBM I customers.
14) You mention the SPARC T6 coming this year – I hope it does because competition drives innovation among all vendors but I won’t hold my breath. Sun and Oracle have a history of delays. Some are to be expected and a few months here and there are no big deal. Let’s see if they deliver and if it is anything of consequence – you set the bar high stating it will be 2X the performance. I know that Power delivers increased performance per core, which translates into greater performance per socket. Add more cores to the socket and you get even greater performance. What I have observed from SPARC and x86 is what they call 2X the performance is the simple statement of going from 4 cores to 8 cores – nothing more and nothing less. I call this disingenuous at best but it goes back to the reader and customer having a responsibility to do their homework as well.
15) With regard to Power8 being up to 1,000 times faster: did you read any of those specific claims, or are you just throwing up your hands in disgust and calling BS? I get it, that’s reasonable… I’m just curious whether you read any of the actual claims. You might want to. When you do, you will see that software such as IBM’s DB2, Cognos, BigInsights and Streams is optimized for Power servers. It knows how to take advantage of the larger L1, L2, L3 and L4 caches, the additional CPU registers, more SIMD, 4X the threads (i.e. SMT8), special data types, accelerators for decimal floating point and encryption, memory bandwidth per socket up to 200GB/sec, and I/O for a 2-socket server approaching 200GB/sec. Lastly, CAPI – did I mention CAPI? CAPI essentially shares a PCIe 3.0 slot and allows special adapters like flash memory cards, fibre adapters or GPUs to connect and participate coherently in the CPU’s memory space. Note that I did not say it has to traverse the full PCIe software stack to do this, so it is very, very efficient. That is the best word to sum up a Power server: efficient.

I wrote a lot, and I hope it helps you and the other readers appreciate how Power servers are actually more advanced than anything else on the market. I encourage you to take a look at them for your Unix and Linux workloads. In light of the recent security attacks on Target, eBay and others: other than the mainframe, there is not a more secure platform than Power, so why not virtualize everything, reduce cost, and increase reliability and performance today? Also, some of my comments may be a bit flip toward you or others who favor x86 and SPARC technologies. I apologize; it is easy to treat this like a basketball game and root for my team a bit too earnestly. Although I stand behind what I have said, my intent is to offer the above in a spirit of helpfulness and clarity. Kind regards.

UsamaBinLaden123

What a post. I started to read, but no, that wall of text was too much for my eyes, and a lot of it was irrelevant too. I stopped reading quite soon. Next time, make it easier to read, please? Structure it better? Anyway, I see a recurring theme in question 3) in most IBMers’ posts, so I will answer that:

1) My nickname does not invalidate my posts, nor the benchmarks I present.

“…3) Many enterprise products are priced based on core. Thus, the per core performance is what matters, not the number of cores per socket (primarily) and definitely not performance / watt or TDP as the x86 crowd likes to focus on…”

Answer: No one cares how many cores a cpu has, nor how many threads it has, when we discuss “the world’s fastest cpu”. It would be another thing if IBM claimed to have released the “world’s fastest core” – but no, IBM has not done that. IBM claims to have the “world’s fastest cpu”, so we need to discuss the cpu.

For instance, you link to a post where they discuss the SPARC T5, and they say “sure, T5 is fastest, but it has so many cores; core per core IBM is faster!” So what if IBM has faster cores? IBM claims to have the “world’s fastest cpu”, so why does IBM shift the focus away from cpus to how many cores the T5 has?

Another IBM blogger claimed that POWER7 was still the world’s fastest cpu because “POWER7 has a faster core in this benchmark XYZ, and therefore the POWER7 is a faster cpu, so Oracle lies about T5 being faster than POWER7!!!”

Isn’t it a bit strange when someone concludes:

POWER7 has faster cores (might be true in that particular benchmark) -> POWER7 is a faster cpu (false).

If POWER7 has two strong cores, and SPARC T5 has 10,000 slightly weaker cores, does this mean the POWER7 is a faster cpu, just because its cores are faster? Does IBM claim to have the fastest cpus in the world, or the fastest cores in the world?

Consider that the IBM POWER6 runs at 5GHz, and still you need 14 (fourteen) POWER6 cpus (in three P570 servers) to match four SPARC T2+ cpus running at 1.6GHz (in one T5440 server) in official Siebel v7 benchmarks. Back in those days, Oracle did not own Sun. Sun was way faster back then in Siebel, and it is faster today too, now that Oracle owns Sun. Yes, I read that IBMers objected to the Siebel benchmarks because Oracle owns SPARC nowadays, but hey – Oracle did not own SPARC back when Sun crushed IBM.

Because the SPARC T2+ was way faster than the P6 in Siebel v7, could Sun have claimed “clock for clock the T2+ is much faster than the P6, and therefore the T2+ is a faster cpu”? No.

You cannot say that a cpu does more work clock for clock and is therefore the faster cpu. Neither can you say that a cpu has a faster core and is therefore the faster cpu.

And besides, the SPARC T5 is often twice as fast as the POWER7 or more in benchmarks, so the T5’s cores are also faster than POWER7 cores.

Brett Murphy

I am not an IBMer but somebody who appreciates technology and its value. Power is the premier RISC platform if you want to run Unix workloads. It’s the only one for IBM i, which is itself an incredible integrated, solution-based offering – truly “set it and forget it”. Linux as an OS makes x86 better, Power makes Linux better, and Linux on Z makes both better, because it makes the most secure and reliable platform on the planet more affordable and open. By the way, IBM’s System x is quite impressive as well. Unfortunately, customers in that space are more interested in buying the least expensive commodity server regardless of features, thinking that VMware is like caulk and can fill any gap, rather than buying a quality x86 solution full of advanced RAS, memory and scaling features.

IBM has not “claimed” to have the world’s fastest cpu. Would you cite the webpage or document? Also, this is a bit of semantics, and you are not alone in being confused. IBM calls the chip that goes in a socket a processor. It may generically be referred to as a “cpu”, but generally it is a processor. A core is a core, but in the IBM world people also tend to refer to a core as a “cpu”. This is where Sun in the past would play word games with its T-series processors in particular: Sun’s 8-cpu T server beat IBM’s 8-cpu server, when in fact Sun meant 8 sockets of 4, 8 or 16 cores each versus a Power server made up of 8 cpus, meaning 8 cores. If you want to dispute this, I will grab an example and post it in a response.

Skipping past your rant, I will summarize core vs. socket and even system performance. When software is priced by the core, raw throughput is not the whole story: 12 powerful cores can match 12,000 weak ones. The 12,000 low-performing cores might look impressive to some consumers, particularly if they look only at the sum of performance. However, when the software cost for that solution closes in on the debt of California, it matters very much whether you are licensing 12 cores or 12,000!

Ah ha – you just did what I said above, in my example of what Sun would do. You and I must have worked together at Sun… are you now at Oracle? I was on the technical side, which is why “facts” matter to me. You are attempting to compare “four” T2+ cpus in a T5440 against 14 Power6 cpus in 570s. Let the readers see for themselves where you destroy your own credibility. The 14 Power6 cpus are just that: 14 cores, made up of 7 dual-core chip modules (DCMs) spread across the Central Electronic Complexes (CECs). Each CEC holds 2 DCMs, or 4 cores, so 4 CECs would give 16 cores; since cores were activated one at a time, with a minimum to start with, there could have been 14 active, or they just tested with 14 – who knows. The T5440 with 4 cpus, as you describe it, actually has 8 cores per processor, so 4 Sun cpus = 32 cores. Hmmm, 32 SPARC T2 cores vs. 14 Power6 cores. The benchmark speaks for itself – I won’t contest it. I will say that, core for core, my P6 will beat your T2, just like today – beat you like a drum!
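For anyone keeping score, the core-count arithmetic in dispute is simple (figures taken from the claims in this thread, not verified against IBM or Sun spec sheets):

```python
# Core counts as claimed in the thread (assumed topology, not verified
# against official IBM or Sun documentation).
p6_dcms = 7                 # dual-core POWER6 chip modules in the benchmark
p6_cores = p6_dcms * 2      # = 14 POWER6 cores
t2_sockets = 4              # SPARC T2+ sockets in one T5440
t2_cores = t2_sockets * 8   # 8 cores per socket = 32 SPARC cores
print(p6_cores, "vs", t2_cores)  # prints: 14 vs 32
```

So by socket count it looks like 14 cpus vs. 4, but by core count it is 14 vs. 32, which is the whole disagreement.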

Not only can I say it, the data backs it up. And not only does the data back it up: as a business partner I prove it every day to customers as we continue our march to take out the remnants of the SPARC install base and now focus on the x86 battle. This isn’t hyperbole. You (the reader) do not have to contact me for this – contact your local BP or IBM resource. If you are a customer running x86 or SPARC and want to compare it to Power8, I will happily bring in a Power8 server (IBM will also) and we will prove it to you; if we have to do it one customer at a time, we will. The naysayers like this guy, who only misstate and confuse, will always be around, because they like being provocative, or because they compete and consider spreading FUD as effective as developing better technology. They are trying to protect their install base of either hardware or software. You may think the same of me, but that is why I take the time to explain the technology, how you benefit, how it is different, and why I am willing to demonstrate it to you. Like I said, don’t call me if you think I’m selling – I’m an Enterprise Architect who is passionate and writing on my own.

UsamaBinLaden123

Thanks for structuring your text better. I don’t work at Oracle; I work at an investment bank doing HFT research.

“…Hmmm, 32 SPARC T2 cores vs 14 Power6 cores. The benchmark speaks for itself – I won’t contest it. I will say that core for core my P6 will beat your T2 just like today – beat you like a drum!…”

OTOH, I might choose to focus on GHz. Then we see that 14 (fourteen) POWER6 cpus running at 4.7GHz needed to consume 65.8GHz in total to match four SPARC T2+ cpus, which consumed 6.4GHz in total. Besides the fact that more GHz means more wattage, one could say that the POWER6 was awfully inefficient, because it needed 10x more aggregate GHz to do the same work as the SPARC T2+. One architecture is 10x less efficient than the other. Is that something to brag about, or what?
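That aggregate-GHz claim does check out arithmetically (a crude sketch; total clock consumed is a rough proxy for efficiency here, not a real power or wattage measurement):

```python
# Aggregate clock consumed in the Siebel v7 comparison, per the figures above.
# "Efficiency" = total GHz needed to do the same benchmark work.
p6_total = 14 * 4.7    # fourteen POWER6 cpus at 4.7 GHz = 65.8 GHz
t2_total = 4 * 1.6     # four SPARC T2+ cpus at 1.6 GHz = 6.4 GHz
ratio = p6_total / t2_total
print(round(p6_total, 1), round(t2_total, 1), round(ratio, 1))  # 65.8 6.4 10.3
```

So the “10x” figure is just 65.8 / 6.4 ≈ 10.3, under the assumption that aggregate GHz is a meaningful unit of comparison at all.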

So, what should we focus on when we compare and try to see which cpu is the fastest? Should we talk about which cpu has the fastest core? Or which cpu does more work per GHz? Or which cpu has the fastest ALU? Or registers? Or bandwidth? Or… what?

Or should we just focus on which is the fastest cpu? Isn’t that simpler, especially when we talk about which cpu is the fastest? And no matter what you say, the SPARC T5 is faster than the POWER7 in most (all?) benchmarks: SPECint2006, TPC-C, TPC-H, etc.

Sure, you can try to talk about the fastest core, or the most efficient GHz, or the fastest ALU, or whatever. But the bottom line is this: the SPARC T5 is a faster cpu than POWER. End of story.

As for the Mainframe: its cpu actually happens to be much slower than a decent x86 cpu. The z196 cpu probably offers half or a third of the performance of a decent high-end x86 cpu. See my links in my previous post above.

And IBM claims that one IBM Mainframe z12 could replace 1,500 x86 servers – is this credible? Especially when you know how much slower the Mainframe cpus really are? Are you surprised that IBM claims to have the world’s fastest cpus?

I remember when IBM claimed that one POWER7 server could replace hundreds of x86 servers – all of them Pentium 3s running at 600MHz or so, with 128MB RAM, and all of them idling.

I am not surprised when IBM claims that a POWER8 server is 1000x faster than an x86 server. They are probably comparing it to a 4.77MHz 8086 server with 64KB RAM.

http://www.gaminglaptopsjunky.com/ Junky

Does anyone think this is a true open-source project? IBM would become Intel in the server CPU market if it could. Plus, Intel isn’t just waiting around for a competitor to snatch it all.

runner

I hope the PC/consumer market will benefit from this IBM endeavor in the end. It seems like a good platform for the future of PCs as well – many threads, wide buses – since Intel has been going nowhere with x86 for the last couple of years and it is time for a change. I would definitely replace my old PC with something more capable than what Intel is currently (re)selling.

http://gamezarchive.blogspot.com/ Mickie James

This is awesome news for me.

POSSIBLE

Can anyone please help with a comparison of Power8 processor capacity versus x86 Intel processor capacity, possibly in matrix form? I’d appreciate it a lot.
