Hortonworks bolster Data Platform with 1.1 release

With their competitors treading into the alpha waters of Hadoop 2.0, Hortonworks hold steady with their Data Platform, adding in some nifty new features

Not much has gone on in the world of Hadoop since June’s
big data extravaganza at Hadoop Summit 2012, which key vendors used
as a soapbox to outline their strategies for the coming
year.

Whilst Cloudera and MapR chose to pursue the possibilities of
Hadoop 2.0 (still in alpha), Yahoo! spinoff Hortonworks unveiled
their long-awaited Hortonworks Data Platform that would stick to
the proven codebase. It left many puzzled, but upon reflection it was a wise decision to opt for
reliability with the enterprise world still having cold feet over
Hadoop distributions. But what else can Hortonworks offer that’s
innovative?

Three months on and the team have released Hortonworks Data Platform 1.1, with some
notable additions to the stack that calm concerns that they were
being too conservative.

Arguably the biggest improvement comes in the form of extra
high availability options, to include the latest versions of Red
Hat Enterprise Linux. This is quite a big deal, opening up the
options to newcomers, who can now use Linux or solutions from
VMware.

Elsewhere, data streaming catcher Apache
Flume makes its debut within the distribution, to
help Hadoop get over its real-time headache. The incubating project
aims to provide a distributed, available system to collect the
morass of log data into one centralised store. It has already
created a big stir, promising to glean insight from the data that
was too clunky for the old Hadoop to deal with.

Ops is another enterprise stumbling block.
Keeping tabs on the Hadoop infrastructure can initially be overwhelming, so HDP 1.1 has
created a deeper ops-centre of sorts to manage clusters and
integrate more third-party tools, all in one place. There’s also
the claim that this platform makes mincemeat of MapReduce jobs,
with a 10% performance boost. No mean
feat when you consider the already swift nature of
Hadoop.

With the final Apache Hadoop 2.0
still some time away, Hortonworks’ decision to get the best out of
the current stable release could prove an astute business move.
Offering substantial high availability options on steady grounding,
and crucially before the next-generation MapReduce (YARN) and HDFS
properly come to fruition, might give Hortonworks the
edge.