Hortonworks announce Stinger to solve Hadoop’s real-time headache

With their competition placing faith in new solutions, the Yahoo spinoff opt to renovate and improve Hadoop veteran Hive.

The race to make Hadoop faster in the enterprise world has
heated up. In October, Cloudera unveiled real-time query engine Impala, whilst MapR put
their weight behind Apache Drill, a real-time analytics project.

Now it’s Hortonworks’ turn to show their hand, but curiously,
they’ve opted to revitalise a part of the Hadoop furniture, rather
than offer something new. Alan Gates, co-founder of Hortonworks,
today revealed details behind the Stinger Initiative, a plan to make
Apache Hive “up to 100 times faster” within their product.

There are several methods behind Hortonworks’ strategy to boost
Hadoop’s data warehousing project. The first is to tune Hive to
focus more on SQL-like queries, or as Gates says, make it “a more
suitable tool for the decision support queries people want to
perform on Hadoop”. Separately, Gates said changes within Hive’s
execution engine will cut query times enough for the tool to
“answer human-time use cases.”

Aside from heavy tinkering to Hive’s existing structure,
Hortonworks have also announced Tez, a latency-reducing runtime
framework that processes “complex” data tasks. Appearing as a
proposal in the Apache Incubator yesterday, Tez also works natively
with YARN, the MapReduce overhaul set to be the centrepiece of
Hadoop 2.0.

Gates also explained that the introduction of a new columnar file
format, called ORCFile, within the community would modernise Hive
and make it more efficient at storing data. The company realise
that collaboration with heavy enterprise Hadoop users, such as
Facebook, is key to the format gaining adoption throughout the
community.

“At Hortonworks, we believe in the power of the open source
community to innovate faster than any proprietary offering,”
explained Gates, adding that the “initiative is proof of this once
again as we collaborate with others to improve Hive
performance.”

A full preview of the Stinger Initiative is expected at Hadoop
Summit Amsterdam in March. Hortonworks still believe that the
answer to Hadoop’s real-time problem lies in the projects that are
already established, rather than introducing new blood to a packed
ecosystem. If it’s possible to renovate and improve the tools
already present in the enterprise environment, rather than
thrusting new ones onto developers, then Hortonworks’ approach might
ultimately pay off.