HP Briefs on Big Data, Analytics with Morgan Stanley’s Help

By Tiernan Ray

This evening, Hewlett-Packard (HPQ) held a webcast, moderated by Morgan Stanley analyst Katy Huberty, to discuss its big data software product, Vertica, featuring a slide show and discussion by HP’s head of that unit, Colin Mahony.

There were no new announcements in the presentation, which was an introduction to technical aspects of the program, and a Q&A comparing the product to offerings from International Business Machines (IBM), SAP AG (SAP), Teradata (TDC), EMC (EMC), and Oracle (ORCL).

Mahony explained the main features of Vertica, which the company acquired in 2011, after it was founded by database pioneer Michael Stonebraker in 2005.

Vertica is for use in querying large amounts of data to analyze patterns, the intersection of the big data and analytics crazes.

The main features are a “columnar” data structure that Mahony said is “much, much faster” than other forms of retrieving data for analysis. Where there are queries acting against multiple data points, the columnar format can be much more rapid than a traditional relational database approach, Mahony explained:

If you ask to see every male within 100 miles of San Francisco between the ages of 30 and 60, we know that you’re requesting an analytic workload that requires three variables: age, address, and gender. By storing all of those together on disk, we can get the data fast and return it to the user. By adding more columns, and not taxing the system the way a row store does, where every query has to go through every row, the results can be much faster.
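To make the contrast concrete, here is a minimal sketch in plain Python (a toy illustration, not Vertica’s actual engine or data) of the same records stored row-wise and column-wise, running Mahony’s example query. The point is that the columnar layout lets the query scan only the three columns it needs, while the row layout forces it through every field of every row.

```python
# Row store: each record is kept together, so a query touches whole rows.
rows = [
    {"name": "Ann",  "gender": "F", "age": 45, "miles_from_sf": 20},
    {"name": "Bob",  "gender": "M", "age": 35, "miles_from_sf": 50},
    {"name": "Carl", "gender": "M", "age": 70, "miles_from_sf": 10},
    {"name": "Dan",  "gender": "M", "age": 40, "miles_from_sf": 200},
]
row_hits = [r["name"] for r in rows
            if r["gender"] == "M"
            and 30 <= r["age"] <= 60
            and r["miles_from_sf"] <= 100]

# Column store: each column is kept together, so the query scans only
# the gender, age, and distance columns -- other columns are never read.
cols = {
    "name":          ["Ann", "Bob", "Carl", "Dan"],
    "gender":        ["F",   "M",   "M",    "M"],
    "age":           [45,    35,    70,     40],
    "miles_from_sf": [20,    50,    10,     200],
}
col_hits = [cols["name"][i]
            for i in range(len(cols["name"]))
            if cols["gender"][i] == "M"
            and 30 <= cols["age"][i] <= 60
            and cols["miles_from_sf"][i] <= 100]

print(row_hits, col_hits)  # both approaches find the same match: Bob
```

Both queries return the same answer; the difference is how much data each has to read. With millions of rows and dozens of columns, skipping the untouched columns (and compressing each column, since like values sit together) is where a columnar engine earns its speed.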

Mahony says there’s a big push by HP to sell Vertica through the “freemium” licensing model: you download a free copy at Vertica.com, try it out against a data set of up to one terabyte, and then, if you want to get more serious, pay for a license.

Mahony said HP is not trying to replace “data warehouses” that companies have spent years building. “Our approach is never to say get rid of the data warehouse, that’s not a good strategy,” said Mahony. “Instead, we say surround the warehouse, and go after analytic use cases, the killer queries.” Nor is Vertica meant to be used for transactions, where online transaction processing (OLTP) systems such as Oracle’s database still serve their intended purpose.

Mahony was asked about competition from a number of products, including SAP’s “HANA” in-memory database, EMC’s “Greenplum” software, IBM’s “Netezza,” and Teradata and Oracle’s dedicated analytics machines.

HANA is not really a competing product, he said. HP and SAP are “great partners,” and HANA is more about in-memory storage of data. “I applaud many of their core design principles,” which include a columnar format, he said. But the in-memory database will take a while to reach the aggregate storage capacity for the many terabytes of data that Vertica is designed to sift through. Vertica itself integrates with the clustered file system storage technology known as Hadoop.

The Teradata and Oracle hardware is proprietary, versus the HP servers’ “open standards,” he said, which means more flexibility.

“Unlike Netezza or Teradata, what people in the industry call proprietary refrigerators, with Vertica and HP, you get the experience of an appliance but also know that you’re not locked in. You can swap out hard drives, for example, you can scale it out to your needs.”

I would note that HP seems to be on something of a campaign to have hosted presentations of its technology these days. The company will hold a briefing a week from today with HP’s head of its networking division, Bethany Mayer, regarding software-defined networks, hosted by ISI Group analyst Brian Marshall. You can catch the webcast of that conference here.


There are 3 comments

MARCH 6, 2013 10:17 A.M.

Anonymous wrote:

hp must be a big advertiser with the Journal?

MARCH 6, 2013 1:10 P.M.

Paul Lynch wrote:

Pretty lame reason to buy a special niche product. Retrieving the maybe 4 million rows for "every male within 100 miles of San Francisco between the ages of 30 and 60" is not a big deal today. You would only be looking at maybe 10GB of data. Most or all of this data would be in main memory, flash memory, or disk frame cache. Many techniques in modern databases can speed this up in a variety of ways.

In addition to being able to handle large volumes of data, including data in Hadoop, SAP HANA is optimized not only for analytical workloads, as the databases mentioned in this article, but also for transactional workloads such as the SAP Business Suite. This gives users the ability to not only ask the tough questions of their data, but to do so directly on their transactional data, completely eliminating the time delay of hours to weeks that exists today between transaction and analytics with other vendors' solutions.

About Tech Trader Daily

Tech Trader Daily is a blog on technology investing written by Barron’s veteran Tiernan Ray. The blog provides news, analysis and original reporting on events important to investors in software, hardware, the Internet, telecommunications and related fields. Comments and tips can be sent to: techtraderdaily@barrons.com.