IBM throws DB2 Power cluster at Ellison's Exadata

The war of words and technology between IBM and Oracle will get a little warmer today as Big Blue launches its DB2 PureScale clustering technology for its DB2 database and Power Systems Unix servers.

As El Reg reported earlier this week, IBM's database and server techies have been cooking up a clustered DB2 database implementation on IBM's AIX-based Power servers to steal some thunder from Oracle and its minion, Sun Microsystems, at Oracle's OpenWorld customer event next week in San Francisco, California.

Oracle is expected to roll out some sort of Sparc-based Solaris system at the event, most likely a cluster of Sun's T5440 servers running Oracle and Solaris and very likely resembling the Exadata V2 x64-Linux database cluster that the pair announced a month ago.

The basic feeds and speeds of the DB2 PureScale offering are exactly what El Reg had heard through the grapevine, with one exception: DB2 PureScale is not a preconfigured system. That sets it apart from the Exadata V2 setup from Oracle and Sun and from IBM's own Smart Analytics System for data warehousing, which debuted in late July.

Rather, it is a database clustering feature that is only being made available on AIX running on IBM's Power Systems iron, and specifically only on its midrange Power 550 machines and its high-end Power 595 servers. The Smart Analytics System is a cluster of Power 550 machines tuned to do data warehousing that runs the AIX operating system, the DB2 database, various Cognos data warehousing products, and soon SPSS analytics now that IBM has acquired SPSS for $1.2bn.

Oracle is sure to dig IBM about the fact that DB2 PureScale is not a system, but something that has to be configured (presumably by IBM Global Services), while the Smart Analytics System is a preconfigured box, ready to go to run data warehouses and their analytics software, as is Oracle's Exadata V2.

IBM seems to have gotten the integrated system down with one offering, but not with the other, which is a bit peculiar. According to Bernie Spang, director of product strategy for the information management division of IBM's software group, though, this is just facing up to the reality that one size does not fit all in servers, storage, and InfiniBand switching when it comes to online transaction processing.

Oracle will no doubt pick on IBM for having one set of stuff for OLTP on database clusters - DB2 PureScale on Power iron running AIX - and another for data warehousing - the Smart Analytics System, which shares many components but which can be quite different.

DB2 PureScale, says IBM, does what many companies have always wanted to do: allow a clustered database to look like it is running on a giant symmetric multiprocessing server, where the clustering is in the chips and chipset and the processor cores share memory. With many parallel database implementations, you have to carve up the datasets and spread them out across database nodes, or you have to tweak your applications so they can run on a parallelized database.
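To make the "carve up the datasets" chore concrete, here is a minimal sketch of the shared-nothing approach that PureScale is pitched against. It is a toy, not anything from IBM or Oracle: the function names, the CRC32 hash, and the sample account keys are all assumptions made for illustration.

```python
import zlib
from typing import Dict, List


def node_for_key(key: str, node_count: int) -> int:
    """Pick a node for a row key using a stable hash (CRC32, chosen for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def partition(rows: Dict[str, int], node_count: int) -> List[Dict[str, int]]:
    """Split a table into per-node slices, as a shared-nothing cluster would."""
    nodes: List[Dict[str, int]] = [dict() for _ in range(node_count)]
    for key, value in rows.items():
        nodes[node_for_key(key, node_count)][key] = value
    return nodes


# Hypothetical sample data: twelve account rows spread over three nodes.
rows = {f"account-{i}": i * 10 for i in range(12)}
nodes = partition(rows, 3)

# A query that needs every row now has to visit every node and merge the results.
total = sum(sum(slice_.values()) for slice_ in nodes)
```

Each row lives on exactly one node, so applications (or the database) must know where to look. That is the bookkeeping a cluster that behaves like one big SMP box is supposed to hide.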

No tuning

Both IBM, with PureScale, and Oracle, with the Real Application Clusters extensions to the past several iterations of the eponymous Oracle database, say they have this problem licked. Moreover, IBM says that customers will not have to go through a lot of database tuning to make PureScale work. And finally, because it is a clustered database, high availability is built in. One database node goes down, another takes over its work.

The DB2 PureScale feature was co-developed by IBM's database software engineers in its Toronto, Ontario software labs and the Power Systems and AIX development lab in Austin, Texas.

The database feature is being branded with the Power HA (short for high availability) brand, which comes out of the Power Systems division within IBM's Systems and Technology Group. The Power HA software stack includes what used to be called HACMP clustering (now known as Power HA for AIX) and High Availability Storage Manager (now Power HA for i, referring to the proprietary i/OS operating system for Power boxes).

For whatever reason, IBM has not rebranded the iCluster HA clustering product it gained through its $161m acquisition of DataMirror in July 2007 with the Power HA moniker. The important thing that Spang wants people to realize is that the Power HA DB2 PureScale feature - that rolls right off the tongue, eh? - is a new clustering technology and is not based on any of these clustering products.

As we explained earlier this week, DB2 PureScale makes use of InfiniBand clustering to link multiple server nodes equipped with AIX and DB2 together. The secret sauce is that it has a designated server in a cluster - which functions much like a head node in a parallel supercomputing cluster - to manage the locking of database fields as transactions are processed, and the locking and unlocking of memory in all of the nodes in the cluster as they seek information from each other as part of the OLTP cranking.

Because the server nodes are linked to each other and to the master PureScale node through the Remote Direct Memory Access (RDMA) features of InfiniBand, the processors are basically cut out of the networking stack, which is not the case with TCP/IP clustering techniques. The central caching server is mirrored so it is not a single point of failure, and it radically cuts down on the intra-node communications that normally happen in a parallel database implementation, according to Spang.
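The general idea of a central lock manager with a mirror can be sketched in a few lines. This is a toy under loud assumptions: the class and method names are hypothetical, locks are plain dictionary entries, and nothing here reflects how IBM actually implements PureScale or talks over RDMA.

```python
from typing import Dict


class LockServer:
    """Toy central lock table: maps a resource name to the node holding it."""

    def __init__(self) -> None:
        self.owners: Dict[str, str] = {}

    def acquire(self, resource: str, node: str) -> bool:
        holder = self.owners.get(resource)
        if holder is None or holder == node:
            self.owners[resource] = node
            return True
        return False  # another node already holds the lock

    def release(self, resource: str, node: str) -> bool:
        if self.owners.get(resource) == node:
            del self.owners[resource]
            return True
        return False  # only the holder may release


class MirroredLockServer:
    """Hypothetical wrapper that copies every state change to a standby."""

    def __init__(self) -> None:
        self.primary = LockServer()
        self.mirror = LockServer()

    def acquire(self, resource: str, node: str) -> bool:
        granted = self.primary.acquire(resource, node)
        if granted:
            self.mirror.acquire(resource, node)  # keep the standby in step
        return granted

    def release(self, resource: str, node: str) -> bool:
        released = self.primary.release(resource, node)
        if released:
            self.mirror.release(resource, node)
        return released

    def fail_over(self) -> None:
        """Promote the standby and seed a fresh mirror from its lock table."""
        self.primary = self.mirror
        self.mirror = LockServer()
        self.mirror.owners = dict(self.primary.owners)


cluster = MirroredLockServer()
cluster.acquire("row:42", "node-a")   # node-a gets the lock
cluster.fail_over()                   # primary dies, mirror takes over
still_locked = not cluster.acquire("row:42", "node-b")  # lock survived
```

Because every node asks one place for a lock, the nodes do not have to negotiate among themselves, and because the lock table is mirrored, losing the primary does not lose track of who holds what.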

Interestingly, the server nodes in a PureScale setup all link to each other through the 12X remote I/O port on the Power Systems servers. The 12X I/O port is a variant of double-data rate InfiniBand that IBM has tweaked to allow remote I/O drawers (laden with disk controllers, disk drives, tape drives, and so on) to be lashed back to central servers.

A decade ago, IBM took a variant of similar remote I/O drawer technology called OptiConnect and a parallel database clustering technology called DB2 MultiSystem to create a parallel cluster of AS/400 proprietary minis that presented a single database view to applications, even though it was running on a cluster. (Hmmm.)

DB2 MultiSystem generated about as many customers as press releases, and IBM stopped talking about it. We'll see if this doesn't happen again with PureScale.

For now, according to Scott Handy, vice president of marketing and strategy for the Power Systems division, IBM is only offering the PureScale feature of DB2 on Power 550 servers, which have from two to eight Power6+ cores in the most recent iterations, and on Power 595 boxes, which have from eight to 64 Power6 cores running at 5GHz. IBM is not offering a blade configuration, which seems odd.

The PureScale feature is only being enabled on the AIX 6.1 operating system and only with the DB2 V9.7 database, and Handy wouldn't say what IBM's plans were for Linux, Windows, or i/OS. Both Linux and i/OS run on Power iron, and DB2 V9.7 runs on Linux and Windows operating systems, so it would seem that Linux should be a snap to support and putting PureScale on i/OS's own implementation of DB2 and enabling it for Windows running DB2 V9.7 would be relatively easy.

According to sources who spoke to El Reg and are familiar with IBM's plans, PureScale is going to be ported to Windows and Linux using DB2 databases, but will not be ported to the HP-UX or Solaris versions of DB2.

IBM is being fairly tentative - at least publicly - about its plans for the PureScale clustering technology, not just because clusters that look and feel like SMP boxes are relatively new to end users, but also, no doubt, because it is worried about the potential impact clusters will have on the sale of SMP iron.

It has always been cheaper to build and buy a cluster of cheaper servers than an SMP box of equivalent raw oomph, but the cluster administration and the application and data tweaking were a costly nightmare. If what IBM says about PureScale is true - that it can scale linearly up to a hundred server nodes or more - then PureScale could take a chunk out of the big Unix and mainframe boxes IBM sells. And equally frightening for the Big Blue sales team is the prospect that customers stop doubling up on their servers for high availability clustering.

Go for configurations

Of course, if IBM doesn't sell the database clusters, Oracle and Sun surely will try. And at many accounts, they may very well succeed in pitching against IBM's big iron, even with PureScale in the mix. It all depends on the bang for the buck and the ease of management.

Spang would not divulge IBM's plans to run benchmark tests on a PureScale setup, but it seems likely that Oracle is going to put Sun's x64 and Sparc clusters through the paces and hammer on IBM.

The DB2 PureScale feature will be available in December. Pricing will be released at that time. All of the AIX iron that it requires is available now. Presumably IBM will cook up some sample configurations for customers to look at as they buy components, and if IBM were smart, it would put together a few configurations with a single product number, as Oracle has done with the Exadata V2 and as IBM has done with the Smart Analytics System. ®