Why IBM needs ETL

Two reasons

Comment: IBM needs to develop a modern, fully functional ETL capability, as opposed to the relatively limited capabilities currently provided by Warehouse Manager. Or it needs to buy one. There are two reasons why.

The first is that arch-rival Oracle has one. Worse, Oracle Warehouse Builder, in its latest incarnation, is pretty good. Microsoft is also doing a lot to expand its capabilities in this area in the forthcoming Yukon release of SQL Server. However, it is Oracle that is the driver for IBM (if you see what I mean).

It is fair to say that ETL in the database is useful only up to a point. It probably isn't the best approach if you have heterogeneous databases across your organisation and, in any case, not all data integration tasks are just about moving data into a database. So, taken in isolation, IBM might have a reasonable argument in maintaining that it will continue to rely on partners, such as Ascential and Informatica, for ETL.

However, it is no longer feasible to treat ETL in isolation, which brings me on to my second reason. I have been maundering on about the synergies that exist between ETL and data federation (or EII) for some time, suggesting that there is so much correlation between these two technologies that it makes sense to have a single engine that supports both. But that contention was not borne out by any major players within the market – until now.

Recently, both Sunopsis and Informatica have announced that they will be supporting both ETL and data federation (and more in the case of Sunopsis) directly from a single platform.

So, now let's turn this argument round: IBM is the major player in the data federation space with WebSphere Information Integrator (formerly DB2 Information Integrator, and still actually a DB2 product). The question is whether IBM can maintain this leadership position once there are other significant vendors in the market (of which there are few today) offering a broader range of capability. The obvious rebuttal is for IBM to say that it is the one offering broader capability, in the sense that it offers content as well as data integration capabilities. But Informatica is also planning to be able to move documents and other so-called unstructured data around. So that response won't wash.

The bottom line, from my perspective, is that IBM needs to be able to offer ETL as a component within WebSphere Information Integrator, as a separately licensed module, just as Replication is today. It might be billed as Warehouse Manager, but it needs to share technology with WebSphere Information Integrator (for example, a single transformation engine) if it is to be credible.

Fortunately, IBM has time on its side: Informatica, for example, will not complete the expansion of its products into data federation until 2006 (though Sunopsis will do so this year). But IBM should be seriously considering its plans right now.