Greenplum vs. Cloudera: Nothing New in Hadoop War, But Can the Market Support Both?

With Hadoop established as a framework for delivering Big Data in the enterprise, many companies are betting their business on services and solutions for Hadoop deployment. Wikibon’s Dave Vellante and Jeff Kelly discussed the big Hadoop war between distribution providers at this week’s Strata conference, focusing on the Greenplum-Cloudera competition.

Commenting on Greenplum’s recent announcement of Pivotal HD, its new Hadoop distro, Wikibon analyst Jeff Kelly stated that “bringing SQL tools inside of Hadoop is the strategy for the future,” which in his view puts Greenplum on the right track. However, the functionality of the new distribution remains to be seen. Pivotal HD is Greenplum’s fourth distro, joining three the company already offers, so the launch might generate confusion in the market and raise integration and functionality issues that Greenplum will have to address.

He added that while he likes the company’s vision for Hadoop, the distribution’s performance and the company’s execution are still unproven. Even so, Pivotal HD is a strong contender against Cloudera’s Impala.

Dave Vellante explained that the friction between Cloudera and Greenplum is not new: the seeds were planted back in 2011 at EMC World, when Greenplum announced a MapR integration after having previously developed Cloudera connectors. Commenting on Hadoop players’ strategy sessions, he noted that the core train of thought is “if we can own the platform, we can make more money.”

Analyzing how the Big Data market is divided among software, hardware, and services, Jeff Kelly indicated that services hold the biggest share. Going forward, he expects the value to shift toward professional services.

Debating the number of Hadoop distributions currently on the market, Jeff Kelly and Dave Vellante agreed that the current six cannot be sustained in the long run. As with other types of platforms, the first developer makes the most money, the second makes less, the third barely makes any, and the remaining competitors have little chance of success. They expect the market for Hadoop distributions to support only about three such products.