DW Appliances: Kognitio to Debut in U.S., Netezza to Scale into the Stratosphere

The data warehouse (DW) appliance market is starting 2008 with two important announcements.

The latest development in that already-teeming market segment: the reemergence of an old-new name, the former Whitecross Systems, which plans to relaunch next month -- at the TDWI World Conference in Las Vegas -- as Kognitio.

Strictly speaking, "relaunch" isn't quite accurate: Kognitio is already a going concern in the U.K., where it absorbed the assets of the former Whitecross back in 2005. Last year, the company finally shifted its attention to the North American market, tapping industry veteran John Thompson to head up its U.S. marketing operations. Whereas Whitecross was a somewhat familiar name in the U.S., the "new" Kognitio will be promoting its own data-as-a-service spin on the (increasingly ubiquitous) DW appliance.

The question, of course, is the degree to which Kognitio's data-as-a-service mantra is different, and what its hardware-independent underpinnings will mean. Kognitio will be making its splash as DW appliance veterans seek to shore up -- or in many cases expand -- their own U.S. market positions.

Consider Netezza Inc., the company (along with Teradata Corp.) that is widely credited with helping legitimize and popularize the DW appliance model. This week, Netezza announced plans to expand the footprint of its Netezza Performance Server (NPS) appliances to several times their current capacity limit. Upcoming enhancements (including an already-promised Compress Engine) will result in appliance capacities of greater than 200 TB, scaling, according to Netezza officials, all the way up to 1 PB.

A Confident Kognitio

Netezza's increased capacity doesn't seem to faze Kognitio officials, who trumpet their own expertise -- as well as that of predecessor Whitecross -- in the high-volume data warehousing segment.

More to the point, argues Thompson, Kognitio's spin on the DW appliance -- which it markets in the form of its WX2 data warehousing software -- differs significantly from those of its established U.S. competitors: it doesn't have an official hardware complement. Kognitio officials claim that customers can deploy WX2 on top of existing hardware assets.

"You take Teradata, they're in my mind the original data warehouse appliance, and you also have folks out there like Netezza and Dataupia and DATAllegro, which is sort of a blended approach [of hardware and software]. Then you have the next version appliance, which is WX2, which is software only," Thompson says. "When we come into an environment and the client says, 'I have five rack-mounted servers and I want to use them for X application,' we can do that. We can run right on those existing servers. Teradata and Netezza or any of those other [appliance] vendors can't do that."

In this respect, Kognitio sounds a lot like another DW vendor, ParAccel, which markets a columnar data warehouse technology customers can deploy on top of existing assets or order preinstalled and preconfigured on hardware assets from Sun Microsystems Inc. and other OEMs.

The resemblance is there, Thompson concedes, but there are key differences. For starters, WX2 doesn't use a columnar database structure. While columnar databases do have undeniable advantages, they can also be difficult to configure and optimize, he argues. "We are a traditional relational database. We don't use any indexing. We don't rely on any segmentation in our partitioning. We allow people to bring data in, add data as rapidly as they want, take data out. People like Vertica and ParAccel are coming back with a columnar approach using compression, and there are benefits to that, but -- as anyone who's ever built one of these [columnar warehouses] knows -- when you build these massive hypercubes, you have to do it three or four times to get it right."

Secondly, Thompson claims, WX2 encapsulates the domain expertise that Whitecross developed during its days as an application service provider (ASP). Consider costing, which WX2 can compute on both a technological (i.e., how much will a specific query cost to run in terms of system resources or processing power) and a dollars-and-cents (i.e., how much will a specific query actually cost the business unit to run) basis.

The takeaway, he says, is a kind of service-enabled spin on chargeback.

"[T]he [WX2] software was written in a way so that if you write SQL and send it into the machine, the first layer that grabs it is what we call an optimizing compiler. This looks at the SQL and decides whether or not to run it natively or to convert it into machine code," he explains. "But it also does a costing allocation -- which is based on machine resources, which can then be translated into a dollar threshold -- and you can take that costing allocation and convert it into dollars. So if [that allocation] comes in and says, 'This is going to cost over X amount,' it will kick it back to the user and say, 'Are you sure you want to run this?'"
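The cost check Thompson describes can be pictured with a minimal sketch. This is not WX2 code: the function name, the resource-unit measure, the dollar rate, and the threshold below are all hypothetical, chosen only to show how a resource estimate might be converted into dollars and compared against a limit before a query runs.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    resource_units: float      # hypothetical measure of machine resources
    dollars: float             # resource estimate converted to dollars
    needs_confirmation: bool   # True if the query should be kicked back


def estimate_query_cost(resource_units: float,
                        dollars_per_unit: float,
                        dollar_threshold: float) -> CostEstimate:
    """Convert a resource estimate to dollars and flag expensive queries."""
    dollars = resource_units * dollars_per_unit
    return CostEstimate(
        resource_units=resource_units,
        dollars=dollars,
        needs_confirmation=dollars > dollar_threshold,
    )


# Illustrative numbers only: 200 units at $0.25 each against a $40 limit.
estimate = estimate_query_cost(200, 0.25, 40.0)
if estimate.needs_confirmation:
    print(f"This query will cost about ${estimate.dollars:.2f}. "
          "Are you sure you want to run this?")
```

In a chargeback setting, the same dollar figure could presumably be billed back to the business unit that submitted the query.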

There's more, too, according to Thompson, who cites WX2's on-the-fly resource allocation (and deallocation) capabilities. What this means, he says, is that customers can allocate additional resources to meet changes in demand, deallocate resources as needed, or even reconfigure an existing data warehouse environment to handle a completely different workload.

"Say, during the day, you might want more of a traditional transactional [workload], but in the evening, you want to run a regression analysis looking at products in relation to one another," he explains. "That's a very processor-heavy configuration, but we can automate that. At 5:00 PM, we can reconfigure that and then set it back to the reporting profile in the morning."
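The time-of-day switch in Thompson's example might look something like the sketch below. The profile names and the 8:00 AM morning cutover are assumptions made for illustration (only the 5:00 PM switch comes from Thompson's remarks), and nothing here reflects how WX2 actually exposes this capability.

```python
from datetime import time

# Assumed cutover times: 5:00 PM is from the example above,
# while the 8:00 AM morning switch is a hypothetical placeholder.
EVENING_START = time(17, 0)
MORNING_START = time(8, 0)


def profile_for(now: time) -> str:
    """Pick a workload profile (hypothetical names) for a time of day."""
    if MORNING_START <= now < EVENING_START:
        return "reporting"   # daytime reporting/transactional profile
    return "analytics"       # processor-heavy evening profile


print(profile_for(time(9, 30)))   # daytime
print(profile_for(time(17, 0)))   # evening switch
```

A scheduler could call such a function periodically and reallocate resources whenever the returned profile changes.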

Kognitio plans to target a sizeable market swathe. Thompson doesn't rule out the sub-TB segment, for one thing, and stresses that WX2 can scale to address multi-TB (and even double-digit TB) requirements, too. Licensing, he says, is flexible: it's available on a per-user, per-seat, per-processor, per-server, or even per-capacity (e.g., 10 TB or less) basis. "The key is that we want to be known as the flexible alternative to these other [appliance] vendors," he indicates.

Late last year, Netezza announced plans to deliver a new Compress Engine feature for its NPS systems. That feature is slated to become available by the middle of this year. Ditto for the company's upcoming NPS expansion, which it says will result in DW appliance configurations that scale beyond the 100 and 200 TB barriers -- scaling, in some cases, up to 1 PB, according to officials.

Compression (or "compressibility") is of growing importance in the high-end data warehousing segment. For one thing, it lets customers use fewer physical disk drives or storage arrays to support ever-larger data warehousing configurations. Furthermore, a reduction in physical storage translates into a corresponding reduction in power, cooling, and data center real-estate costs.

Contrary to what you might think, however, Netezza isn't re-architecting its relational data warehouse format -- although its Compress Engine does translate data into a kind of hybrid columnar database structure. (Columnar databases tend to boast extremely high compression levels.) Mostly, Netezza relies on the processing power of the field programmable gate arrays (FPGAs) that sit alongside the PowerPC processors in its Snippet Processing Units (SPUs), as well as on proprietary compression algorithms. The combination lets it achieve extremely high on-the-fly compression rates, says Phil Francisco, director of product marketing with Netezza.

"What the Compress Engine is is an extension of that FPGA capability to do that [decompression] of the data as it comes off the disk drive as fast as you can read it from the disk. You get sort of maximum performance on the throughput of the system. We do [perform] the compression in a columnar way, but the system is still a [row]-based database management system. The [compression] algorithm that we use is our own patent-pending one," he explains.

Chalk it up to the advantage of Netezza's PowerPC-based architecture, which is cooler and more efficient than competitive designs based on chips from Intel Corp. and Advanced Micro Devices (AMD) Inc. "We can buy embedded versions of PowerPC that use very little power and allow us to be very power efficient, and that leads to a really significant power savings for customers purchasing our solution," Francisco claims.

"Each one of those [processing units] consumes only 30 watts [of] power and we can put a hundred of those in a rack and get very high efficiency," he continues, noting that competitive chips (such as multi-core designs from Intel and AMD) dissipate several times as much power. "We can deliver better data densities and not sacrifice performance in doing that -- we'll be able to actually increase performance in doing so."
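Francisco's figures imply a simple back-of-the-envelope rack budget, sketched below. The function name is hypothetical, and the calculation covers only the processing units he mentions; it makes no assumption about cooling or other rack overhead.

```python
WATTS_PER_UNIT = 30    # per processing unit, per Francisco
UNITS_PER_RACK = 100   # "a hundred of those in a rack"


def rack_power_watts(units: int = UNITS_PER_RACK,
                     watts_per_unit: int = WATTS_PER_UNIT) -> int:
    """Total draw of the processing units alone, in watts."""
    return units * watts_per_unit


total = rack_power_watts()
print(f"{total} W ({total / 1000:.0f} kW) per rack")  # 3000 W (3 kW) per rack
```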

Netezza isn't sweating the heightened competition in the DW appliance segment, either, according to Francisco. "I like where we are in the market. I like our opportunity and I like what we see in front of us."