IBM considers System X surgery: Will it ruin its sexy HPC figure?

A full X-ectomomy might, a partial could be OK

HPC blog The increasing amount of speculation over IBM's potential unloading of all or part of its System x (x86-based servers) business onto Lenovo has high performance computing (HPC) players wondering about the implications for the sector. What would be the eventual impact on IBM, and the HPC market in general, if Big Blue were to dump the x86 end of its system business?

These systems are big business, albeit at margins ranging from low to extremely low. As our pal TPM points out, IBM has never been good at low-margin lines of business.

However, IBM is pretty good at HPC, and HPC is a market where bunches of low-cost boxes are combined to build the largest, and some of the most expensive, systems in the world. And the innovations that drive HPC performance eventually find their way into enterprise and even consumer tech products.

It’s impossible to find good numbers on exactly how many HPC systems and individual models are in use today. In fact, it’s increasingly difficult to differentiate an HPC workload from an enterprise compute-intensive or Big Data workload. (I’d argue that they’re the same thing, with the only difference being the data they’re analysing and the questions they’re answering.)

Since I couldn’t get solid numbers for analysis of the entire market, I worked over the latest Top500 list to see what’s what at the high end of HPC.

Top of the flops

As a system vendor, IBM has more systems (193, or 39 per cent) on the Top500 list than any of its competitors. HP is in second place with 146 systems for a 29 per cent share. IBM’s lead gets larger when you look at performance. IBM’s systems on the list total 66.216 petaflops; Cray’s second-place showing is just 28.189 petaflops.
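For anyone who wants to check the share arithmetic, here is a minimal sketch using only figures quoted in this article. The list-wide total of 162,139,387 gigaflops is the November 2012 figure cited later in the piece; the variable and function names are my own illustrative choices, not part of any Top500 tooling.

```python
# Figures quoted in the article; names are illustrative assumptions.
TOTAL_SYSTEMS = 500
TOTAL_PFLOPS = 162_139_387 / 1_000_000  # list total, gigaflops -> petaflops


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


ibm_system_share = share(193, TOTAL_SYSTEMS)
hp_system_share = share(146, TOTAL_SYSTEMS)
ibm_perf_share = share(66.216, TOTAL_PFLOPS)
cray_perf_share = share(28.189, TOTAL_PFLOPS)

print(f"IBM system share:       {ibm_system_share:.1f}%")
print(f"HP system share:        {hp_system_share:.1f}%")
print(f"IBM performance share:  {ibm_perf_share:.1f}%")
print(f"Cray performance share: {cray_perf_share:.1f}%")
```

On these numbers IBM's slice of total performance comes out slightly larger than its slice of the system count, which is the widening lead described above.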

But these numbers for IBM include its homegrown BlueGene, PowerXCell (Cell BE), and POWER 6/7-based systems as well as IBM systems based on x86 processors. What do the share numbers look like when we cut for processor architecture?

Here's a blinding glimpse of the obvious: Intel has a pretty solid position in the HPC world these days. Close to 80 per cent of the Top500 systems on the list are fueled by Intel processors. If you add AMD’s smallish system share, you find that the x86 architecture accounts for 88 per cent of the largest systems in the world.

The next largest chunk is IBM’s non-x86 systems. These include IBM’s BlueGene systems, POWER6 and POWER7-based boxes, and its two big x86/Cell hybrids (Roadrunner and Cerrillos).

It could be argued that Roadrunner and Cerrillos shouldn’t be included in this category because they use Opteron processors in addition to IBM’s Cell accelerators. To me, the complexity of these systems and their use of the semi-exotic Cell processor make them quite a bit different than any other traditional x86 cluster, even one with GPUs, so I’m putting them in the ‘non-x86’ bucket.

The balance changes when we look at performance share of systems on the latest Top500 list.

Total performance on the November 2012 Top500 list, as measured by the Top500 "Rmax" metric (ie, the best performance the cluster actually achieved on the Linpack benchmark), was 162,139,387 gigaflops. Breaking that number down by processor type, then bundling the results into the categories on the chart, tells a somewhat different story from the one shown by the system share chart above.

Intel and the x86 architecture account for 64 per cent of total performance on the list. What this really tells us is that the non-x86 systems are very large. IBM only has 53 non-x86 systems on the list, but they account for almost 30 per cent of total performance.
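To put a rough number on "very large", here is a back-of-the-envelope sketch of average performance per system. It assumes the 64 per cent performance figure applies to the same x86 population that makes up 88 per cent of the 500 systems, and it treats "almost 30 per cent" as exactly 30; both are my simplifications, so the results are estimates only.

```python
# Rounded figures quoted in the article; per-system averages are rough estimates.
TOTAL_GFLOPS = 162_139_387   # November 2012 list total (Rmax)
TOTAL_SYSTEMS = 500

x86_systems = round(0.88 * TOTAL_SYSTEMS)   # 88 per cent of systems
x86_gflops = 0.64 * TOTAL_GFLOPS            # 64 per cent of performance

ibm_non_x86_systems = 53
ibm_non_x86_gflops = 0.30 * TOTAL_GFLOPS    # "almost 30 per cent", taken as 30

# Convert gigaflops to teraflops for readability.
x86_avg_tflops = x86_gflops / x86_systems / 1_000
ibm_non_x86_avg_tflops = ibm_non_x86_gflops / ibm_non_x86_systems / 1_000
ratio = ibm_non_x86_avg_tflops / x86_avg_tflops

print(f"Average x86 system:         {x86_avg_tflops:.0f} teraflops")
print(f"Average IBM non-x86 system: {ibm_non_x86_avg_tflops:.0f} teraflops")
print(f"Ratio:                      {ratio:.1f}x")
```

Under those assumptions, the average IBM non-x86 machine is several times the size of the average x86 system on the list.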

These systems include IBM’s BlueGene boxes, which hold down the second, fourth, fifth, and ninth slots in the top 10, and 13 out of the top 50 systems, along with IBM’s Roadrunner Cell and other POWER processor-based systems.

IBM has a large presence on the non-x86 side of the Top500, but how does this compare to IBM’s total Top500 installations? In other words, how much x86 is IBM selling into the top HPC data centres in the world? Fortunately, I have more charts to share.

As a share of IBM’s total sales to the Top500, typical x86 boxes are the largest segment in terms of system count at 33 per cent. These are IBM’s M3/M4 and other traditional rackmount "pizza box" systems. The vast majority are dual-socket Xeon-based servers that can be found in pretty much every data centre in the world.

Interestingly enough, 43 out of 63 of these systems reside in China and are used by Internet service providers. So if some sort of IBM-Lenovo asset sale takes place, Lenovo will have a home court advantage with these customers – and increase its presence on the Top500 list from a single system to 40+.

We can also see that IBM’s iDataPlex accounts for 31 per cent of the systems it has sold into the Top500. These systems were originally aimed at net service providers who would be attracted by their no-frills, high-density, low-power, low-cost design.

However, they have caught on with HPC customers who need exactly these same attributes. Adding support for GPU accelerators has also helped turn iDataPlex into a more performance-oriented system.

Systems based on the BladeCenter infrastructure round out the rest of IBM’s x86 offerings, accounting for 11 per cent of IBM Top500 installations, primarily showing up in the lower third of the current list.

IBM’s non-x86 BlueGene, Cell, and POWER processor-based systems account for only 26 per cent of IBM’s systems on the latest Top500. The BlueGene systems have very low sales volume (34 systems on the list) but are also the most expensive machines for IBM to design and manufacture. IBM’s POWER-based systems are very close to its commercial Power System Unix offerings, so development and manufacturing costs are spread out over a much wider base of customers.

When we look at IBM’s product segments based on performance, the picture changes radically.

Garden-variety x86 systems that made up a third of IBM’s Top500 sales are now a much smaller (9 per cent) slice of the pie when it comes to performance.

IBM’s BlueGene systems account for more than 60 per cent of the Big Blue-provided gigaflops in 2012. Adding in the POWER-based systems, we see that IBM’s non-x86 boxes drive 70 per cent of its installed gigaflops on the Top500 list.
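As a sanity check, the 70 per cent figure can be compared with the earlier statement that IBM's non-x86 systems account for almost 30 per cent of all performance on the list. This is a minimal sketch using the article's rounded numbers; the variable names are illustrative.

```python
# Figures quoted in the article; names are illustrative assumptions.
IBM_TOTAL_PFLOPS = 66.216
LIST_TOTAL_PFLOPS = 162_139_387 / 1_000_000  # gigaflops -> petaflops

ibm_non_x86_pflops = 0.70 * IBM_TOTAL_PFLOPS          # 70 per cent of IBM's total
ibm_x86_pflops = IBM_TOTAL_PFLOPS - ibm_non_x86_pflops
non_x86_share_of_list = 100.0 * ibm_non_x86_pflops / LIST_TOTAL_PFLOPS

print(f"IBM non-x86:   {ibm_non_x86_pflops:.2f} petaflops")
print(f"IBM x86:       {ibm_x86_pflops:.2f} petaflops")
print(f"Share of list: {non_x86_share_of_list:.1f}%")
```

The result lands a little under 30 per cent of the whole list, so the two figures agree to within rounding.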

On the x86 front, iDataPlex is the largest system family segment when it comes to performance (at 18 per cent). BladeCenter installations were mid-sized, with an average ranking of 220 when they first hit the chart. As they’ve aged, the average ranking has slid to 389. We see the same pattern with systems in the rackmount "Xeon M3/M4/other" bucket: when first installed, the average ranking of these systems was 273; now the average is 367.

Implications of IBM-Lenovo X-ectomomy

If we are talking about a full X-ectomomy, with IBM selling its entire x86 business to Lenovo, the implications are somewhat grim for future IBM HPC business. IBM would lose almost half its installations on the Top500 list, but only 30 per cent of the aggregate performance – meaning it would lose a lot of smallish system footprints. (“Smallish” in terms of Top500 – which is large in any other context.)

On the enterprise side of the house, it could be argued that IBM would still be in roughly the same competitive position, since part of any deal would almost certainly include Lenovo giving IBM "most favoured trading status" when it comes to reselling Lenovo’s new server line. Lenovo would produce the servers much more cheaply than IBM and would pass those savings onto Big Blue.

But this might not wash when it comes to HPC deals with large governmental organisations. Many of these organisations will not buy from a foreign manufacturer, period. They have a number of reasons for this, which probably include security concerns, a greater ability to regulate domestic vs foreign vendors, public relations and other considerations that I’ve missed.

I would argue (and have argued) that the bulk of the guts of every system today is manufactured overseas – primarily in Asia. Domestic vendors today mostly assemble systems, putting the parts together, slapping on an operating system, testing it, then shipping it out to the customer. In some cases, even for supercomputers on the Top500 list, systems are fully assembled in Asia and shipped as a whole unit to the end user. Given this, does it really make any difference what name is on the case?

What's in a name? The awarding of MASSIVE government contracts

It turns out that it does. I’ve heard through the HPC grapevine that several large US government customers dropped ThinkPads when Lenovo purchased IBM’s PC business. My sources anticipate the same thing if Lenovo takes over all or part of IBM’s System x product line. But this only applies to IBM’s US government business, not to the rest of its HPC customer set or to enterprise customers. So while a full X-ectomomy would have a significant impact on Big Blue in HPC, it’s not a killing blow.

I think the most likely case is that IBM does a partial X-ectomomy, selling off the low end of System x while keeping BladeCenter and iDataPlex. If this is the case, then IBM will lose some spots on the Top500 list and probably some customers, but these would be Big Blue's smallest systems in the most competitive (meaning low margin) market segment.

IBM’s iDataPlex and BladeCenter offerings have the flexibility and configurations to handle soup-to-nuts x86 implementations. While most of these installations are mid-sized, the SuperMUC iDataPlex system is currently #6 on the Top500 and the first petaflop system, Roadrunner, was built on a BladeCenter chassis. So these boxes can certainly scale when needed.

We’ll see what happens in the fullness of time, but no one is denying that IBM and Lenovo are talking about a sale. All signs point to a late May announcement, with a 1 June change of control. ®