Unless you've been living on a desert island, you can't have failed to notice that the current hot topic is "Big Data". It even featured in a programme on BBC Radio 4, which must be a first for an IT topic!

Over the last couple of months the social media output from IBM on this topic has been intense. Not a day went by without at least one blog / tweet / podcast / webcast on the topic. It was all obviously leading up to something big. And that "something" happened last week, with a briefing at IBM's Almaden facility near San Jose, California, announcing a range of offerings based around IBM research called "BLU" (without the "E").

While the announcements covered a number of products, the one of most interest to me was DB2 10.5 with BLU Acceleration built in. Not that this was news to me: those of us involved with the DB2 Customer Advisory Council have been up to speed with this development for quite some time. It's only now that we can talk about it.

I'm not going to say a lot about the technology itself. It is covered by many other articles and blogs. But in essence we are being given a columnar data store integrated into the core DB2 engine, with potentially huge benefits for Business Intelligence workloads. I'm trying to get to grips with the technical details of the software at the moment through the DB2 Early Access Program, and I'll no doubt have more to say on it later.
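
For anyone new to the idea, here is a toy sketch of what "columnar" means. It is only an illustration using plain Python lists: the table, the column names and the helper functions are all made up for the example, and it says nothing about how BLU is actually implemented inside DB2.

```python
# Toy illustration only: plain Python lists, not DB2 internals.
# The table and column names below are hypothetical.

# Row-organized: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 80.0},
    {"order_id": 3, "region": "EMEA", "amount": 200.0},
]


def to_columns(rows):
    """Pivot row-organized records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(rows):
    """Sum 'amount' by visiting every record, other fields included."""
    return sum(row["amount"] for row in rows)


def total_column_store(columns):
    """Sum 'amount' by reading only the one column that is needed."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_row_store(rows), total_column_store(columns))
```

Both functions return the same total. The difference is that the column-organized version only touches the "amount" values, which is the general reason a columnar layout tends to suit the scan-and-aggregate queries typical of Business Intelligence work.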

What does interest me is that this continues a trend of bringing new data storage options into the DB2 engine, in contrast to standalone niche products for each storage option. Over the years we have seen standalone object databases, XML data stores and OLAP cubes. The issue with all of these was that, despite the efforts of some vendors to convince us otherwise, not all data was best handled in one particular storage format. So over time these standalone products became niche offerings and the (originally) relational DBMSes became hybrid object/relational/XML/OLAP engines. More recently RDF triple stores and now columnar data stores have joined the party.

It's all about choice ... or "horses for courses". As database professionals we have to learn which option is best for any particular situation. We have to learn to walk the middle ground, embracing all the options. Too often I've seen folks losing out by going off in one of two directions, particularly when pureXML was introduced. One group basically rejected XML storage for anything (for a variety of reasons ranging from the fear of learning new stuff to a belief that the relational model was the "one true way"). Another group wanted to store everything in XML. Those who got the most benefit from pureXML were those who learnt when to use it and when to use something else. IBM were a bit slow in coming up with guidelines of this nature, but recently there have been a number of good sessions on this topic.

I think it will be the same with BLU. It's obvious that not everything is right for BLU. But some things are absolutely right for it. Getting it right will be our challenge.

The other comment I'd make at this time is that DB2 10.5 is not just BLU. There is more great new functionality in there as well, some of which I believe is more generally useful than BLU. So don't forget to check out the rest of the story. In particular, the pureScale / HADR combo is a real world-beater. I'll try to write on some of this shortly.