Teradata Brings Graph Analysis To SQL

Teradata SQL-GR engine adds a big data analysis option to the Aster Discovery Platform. So what's left for Hadoop?

Teradata on Tuesday introduced a major upgrade of its Teradata Aster Discovery Platform with a twist: a new SQL-based graph-analysis engine to go along with the platform's existing SQL query and SQL-based MapReduce capabilities.

The new graph engine, called Teradata SQL-GR, is geared to analyze relationships, such as those in social networks or across phone or mobile networks. The graph engine rounds out what's now called a "Snap Framework" of analysis options all based on SQL.

To better support multi-structured data, the Teradata Aster 6 release also adds a file store, akin to the Hadoop Distributed File System (HDFS), called the Teradata Aster File Store. The upgrades send a clear message: you can now do all your analysis of data -- whether structured or multi-structured -- with the non-threatening ease and familiarity of SQL. No high-priced data scientists required.

"Hadoop is a great technology, but it requires different engines that don't talk to each other and that require different languages," Teradata product and services marketing manager Chris Twogood told InformationWeek in a phone interview. "With the Snap Framework, we're the first in the industry to deliver an integrated process for discovery."

What Teradata's announcement leaves less clear is what, exactly, companies are supposed to do with Hadoop -- a platform Teradata supports as a reseller and support provider for the Hortonworks Data Platform (HDP). Twogood's answer is that Hadoop is good for high-scale, low-cost storage -- the analytic archive or data lake role -- and for high-scale data-transformation jobs.

In this vision, Hadoop handles the scale while Teradata Aster is a smaller-scale, transient discovery platform used for day-to-day analysis. The Teradata Aster File Store is compatible with and complementary to HDFS, Twogood insisted. Data subsets can be copied from HDFS to the Teradata Aster File Store, or analysts can use SQL-H, Teradata's SQL-on-Hadoop option, to grab data from Hadoop on the fly, he said.

Teradata's vision supposes that enterprises are prepared to support three separate data platforms to manage information: Teradata for in-database analysis of structured data, Hadoop for high-scale, multi-structured data, and Teradata Aster for SQL, SQL-MapReduce and SQL-graph analysis of boiled-down and/or blended data sets.

Teradata acquired Aster Data in 2011 for $263 million, but that was before Teradata struck up its partnership with Hortonworks. The company was clearly still keen to push Aster. Last fall Teradata introduced a big data appliance combining HDP Hadoop nodes and Aster nodes. It also introduced a free developer instance of the database called Teradata Aster Express.

There are signs the Aster-Hadoop combo appliance was not a big seller. In June, Teradata deepened its partnership with Hortonworks by agreeing to resell and support HDP, and it also introduced the Hadoop-only Teradata Hadoop Appliance. Teradata declined to provide details on just how many customers are using Teradata Aster, but Twogood insisted it's gaining new customers.

As for the big data realm, Twogood described "disparate tools and processes" involving graph options such as Neo4j and SPARQL, which he dismissed as memory bound and not terribly scalable. He also dismissed the laborious writing of MapReduce jobs and the use of Pig programming as something totally foreign to SQL practitioners.

Teradata's criticisms raise the question of how much the company is counting on SQL-H and other emerging developments to bring faster and easier data-analysis options to Hadoop.

"There is work being done on Hadoop, like [Hortonworks'] Tez and [Cloudera's] Impala and the [Apache] Giraph graph database and MapReduce, but all of those things require different users, different technologies and different languages, and you need the data scientist to be the optimizer," said Twogood. "With Aster, we're delivering all of these engines but we're removing all that complexity."

Teradata isn't alone in promoting SQL as a friendlier route to big data analysis. Speaking at Oracle OpenWorld last month, Oracle executive Andy Mendelsohn dismissed Hadoop as "primitive and batch oriented" and demonstrated a comparison of a seconds-long SQL analysis to a long-running MapReduce job that required 600 lines of code. There was no mention of SQL-on-Hadoop options and no detail on the variability and volume of the data used in the demo analysis.

With Tuesday's announcement, Teradata is alone in supporting MapReduce and graph analysis as well as standard SQL on one platform, so it's ahead of competitors Oracle, IBM and EMC-spinoff Pivotal in that regard. The breadth of analysis options is clearly a good thing for Teradata Aster customers.

As for those seeking to do more, not less, with Hadoop, Teradata Aster 6 might seem like a SQL sideshow.
