Data Virtualization is Far from a Pipedream

Bank of America, Comcast, and other shops are using data virtualization to support applications, services, and initiatives.

By Stephen Swoyer

11/29/2011

To listen to Ted Hills tell it, data virtualization -- DV, for short -- is far from a pipe dream. In fact, said Hills, an architecture executive with financial services giant Bank of America, it's rather more of a lead-pipe cinch.

According to Hills, BoA's use of DV technology from Composite Software Inc. helps enable real-time information access, and in 2011, Hills argued, there's no excuse not to be doing things in real time. If you're going to do real-time right, he said, you need a DV layer.

"It's the 21st Century, a 2-GHz processor is standard issue ... and we're going to wait until 6 pm to [do a batch process]? It just doesn't make sense. We really need new thinking," Hills told attendees, explaining that data virtualization has changed the way BoA thinks about information access.

"The data virtualization tool is hiding the differences between sources, so even though you may be getting your data in real-time, you can configure [your tool] to make it all look like a set of tables, to get the uniformity that you need," he explained. "You can build in data conformance into the virtual views so that you eliminate the differences between reference data sets, and it becomes practical to join across the network. I view that as some of the most special of the special magic that Composite [Informattion Server] provides."

A Thriving Marketplace

Composite isn't the only name in the data virtualization game, of course. Some of its rivals tout thriving data virtualization practices. Denodo Technologies Inc., IBM Corp., and Informatica Corp. are veteran competitors, and more recently SAS Institute Inc. subsidiary DataFlux mounted a fresh challenge of its own.

Informatica executives such as Ash Parikh, director of product marketing for Informatica, are as gung-ho as their Composite counterparts on the issue of DV.

Parikh makes two arguments. First, data virtualization can radically accelerate the data integration timetable, shifting the period from "days or weeks or months down to hours or even minutes." Second, and most important, he continues, DV emphasizes the involvement of business users at a very early stage and promises to give the business an ownership stake in the project.

Informatica says that it isn't just talking about a DV pipe dream, either. Officials point to the breadth of an Informatica integration portfolio that includes life cycle management tools such as Data Archive and Dynamic Data Masking, along with bread-and-butter DI technologies such as ETL and data quality.

Adam Wilson, general manager for Informatica's Information Lifecycle Management business, says a product such as Dynamic Data Masking addresses a requirement -- i.e., a need to mask data in real-time, on the fly -- that's going to become increasingly important in the DV-layered architectures of the future.

"Without [employing] complex tokenization, or something that has a much heavier footprint [and which] requires a lot more changes to the underlying database ... it lets you [retain] all of your row- and column-level security at the database-level, so you still have all of your users and groups and privileges," he explains. "This is going to sit in between data sources and applications and allow you to determine what information specific to certain accounts" users can see.

DV, Live Without a Net

Last month, Hills and nearly half a dozen of his colleagues gathered to discuss their use of DV at Composite's Data Virtualization Day (DVD) event in midtown Manhattan. In addition to Bank of America, representatives from Comcast, Compassion International, Putnam Investments, and Qualcomm discussed their very real, very in-production uses of data virtualization.

"A data warehouse is never done. It's obsolete by the time you put it in," said Arvinder Oberoi, senior vice president and director of applications at Putnam.

"Data virtualization helps because if you're not moving the data, there's technically no reconciliation, and it guarantees consistent results," he continued. "Putnam has been a Composite customer since 2006. We got into it because at that time [our] infrastructure was going through some major upheaval, and we thought, 'this would be a nice way to abstract away those source systems … without affecting the consumers. The loose coupling is why we got into it."

In practice, Oberoi said, DV tends to sell itself.

"Once we had it, and we had the single source, and everything's flowing into Composite, it made logical sense that if you have all of the data in one place, you use it for your analysis or your reporting or your BI."

Although some players want to package DV as part of an overall information integration vision -- encompassing ETL, data profiling, data quality, information lifecycle management, master data management (MDM), and other amenities -- at least one speaker at last month's DVD event said he wasn't interested.

Not if such a vision requires the use of vanilla ETL, that is.

"Our main goal is to remove ETL in the traditional sense as much as possible," said Kenny Sargent, technical lead for enterprise data warehousing with Compassion International. "I don't want you to misunderstand what I mean by that. There's no such thing in the distributed world as not having extracts, rows, or transforms. What we try to do is push the transactions as far upstream as possible," continued Sargent, one of the most dynamic and engaging of the day's presenters. "We call it TLT. We want transformations in these reusable views … [where] they're a lot easier to maintain."