My client Teradata bought my (former) clients Revelytix and Hadapt.* Obviously, I’m in confidentiality up to my eyeballs. That said — Teradata truly doesn’t know what it’s going to do with those acquisitions yet. Indeed, the acquisitions are too new for Teradata to have fully reviewed the code and so on, let alone made strategic decisions informed by that review. So while this is just a guess, I conjecture Teradata won’t say anything concrete until at least September, although I do expect some kind of stated direction in time for its October user conference.

*I love my business, but it does have one distressing aspect, namely the combination of subscription pricing and customer churn. When your customers transform really quickly, or even go out of existence, so sometimes does their reliance on you.

HadoopDB tied a bunch of PostgreSQL instances together with Hadoop MapReduce. Lab benchmarks suggested it outperformed the coyly named DBx (where x=2), but wasn’t necessarily competitive with top analytic RDBMS.

Hadapt was formed to commercialize HadoopDB.

After some fits and starts, Hadapt settled down as a Cambridge-based company. Former Vertica CEO Chris Lynch invested even before he was a VC, and became an active chairman. Not coincidentally, Hadapt had a bunch of Vertica folks.

Chris Lynch, who generally seems to think that IT vendors are created to be sold, shopped Hadapt aggressively.

As for what Teradata should do with Hadapt:

My initial thought for Hadapt was to just double down, pushing the technology forward, presumably including a columnar option such as the one Citus Data developed.

But upon reflection, if it made technical sense to merge the Aster and Hadapt products, that would be better yet.

I herewith apologize to Aster co-founder and Hadapt skeptic Tasso Argyros (who by the way has moved on from Teradata) for even suggesting such heresy. 🙂

Complicating the story further:

Impala lets you treat data in HDFS (Hadoop Distributed File System) as if it were in a SQL DBMS. So does Teradata SQL-H. But Hadapt makes you decide whether the data is in HDFS or the SQL DBMS, and it can’t be in both at once. Edit: Actually, see Dan Abadi’s comments below.

Impala and Oracle’s new SQL-H competitor have daemons running on every data node. So does one option in Hadapt. But I don’t think SQL-H does that yet.

I was less involved with Revelytix than with Hadapt (although I’m told I served as the “catalyst” for the original Teradata/Revelytix partnership). That said, Teradata — like Oracle — is always building out a data integration suite to cover a limited universe of data stores. And Revelytix’s dataset management technology is a nice piece toward an integrated data catalog.

Also, just to clarify what you mean by “Hadapt pivoted” — our messaging/positioning did indeed pivot as you mentioned earlier in the paragraph (thanks in part to your advice). However, we continued to invest in our SQL-on-Hadoop solution, significantly improving the SQL support over the past year. I suspect that our core SQL-on-Hadoop product was a major component of Teradata’s decision-making process.

[…] appears to have been a fire sale) the assets of SQL-on-Hadoop pioneer Hadapt. (Analyst Curt Monash published some good insights on both deals at the time.) Randy Lea, president of Teradata’s big data practice, said the Revelytix deal […]

Now that Teradata has bought Hadapt, it would be interesting to hear what Prof. Abadi thinks now — i.e., are there any changes in the position or major premise of his article from 2012 (that Hadoop needs to compete head-on with RDBMSs)? It is quite ironic that the first potential Hadoop target (Teradata) is firing back and buying one of the Hadoop products that was perhaps supposed to replace it. Or not — it is actually quite common (Oracle buying MySQL, for example).

@Ranko Teradata’s acquisition of Hadapt actually validates the premise of the article you cite in your post. Connector-based strategies for integrating data between Hadoop and analytic databases are, in general, slow and inefficient. Importantly, they completely miss the opportunity to combine Hadoop’s resilient, shared-nothing parallelism with the node-level throughput of analytic databases. And there is a growing population of analytic applications that could benefit from just such a combination.

As Curt pointed out, HadoopDB (and Hadapt, its commercial successor) uses PostgreSQL for node-level storage and query processing, all while exposing a SQL interface to the application tier. Imagine the possible benefits of replacing PostgreSQL with high-performance columnar database nodes. The result would blend the best of Hadoop’s parallel communications framework with the proven query throughput of a column store (or read-optimized row store), without the need for clunky ETL-style connectors.
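To illustrate the node-level throughput argument in miniature (a hypothetical sketch, not Hadapt or HadoopDB code): a column store keeps each column’s values contiguous, so a single-column scan touches only that column’s data, while a row store must walk every full record even when the query needs one field.

```python
from array import array

N = 1000

# Row-store layout: each record is a tuple; scanning one column
# still iterates over every full row object.
rows = [(i, i * 2.0, "name%d" % i) for i in range(N)]
row_scan_total = sum(r[1] for r in rows)

# Column-store layout: each column is a contiguous typed array;
# a single-column scan reads only that column's values.
col_ids = array("l", range(N))
col_vals = array("d", (i * 2.0 for i in range(N)))
col_scan_total = sum(col_vals)

# Same answer either way; the difference is the access pattern
# (and, at scale, cache behavior, compression, and I/O volume).
assert row_scan_total == col_scan_total
```

The toy example gives identical results by construction; the performance argument is about the contiguous layout, which is what real columnar engines exploit for vectorized scans and compression.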