The Hortonworks Model

This is a unique moment in time. Fueled by open source, Apache Hadoop has become an essential part of the modern enterprise data architecture and the Hadoop market is accelerating at an amazing rate.

The impressive thing about successful open source projects is the pace of the “release early, release often” development cycle, also known as upstream innovation. The process moves through major and minor releases at a regular clip and the downstream users get to pick the releases and versions they want to consume for their specific needs.

Assembling a complete platform like HDP requires choosing the right stable version of Apache Hadoop as the foundation, then integrating and packaging the optimal versions of all the other ASF components into a well-tested, certified data platform.

Working within the community for the enterprise

Since we are committed to delivering HDP completely in the open, we introduce enterprise feature requirements into the public domain and write code to address them, contributing everything back to the wide array of ASF projects in HDP. Why? Because we want the Hadoop market to work at scale, and we know code is king, so we practice what we preach: we contribute everything.

Acquire and Contribute: Acquire innovative companies and contribute the IP to the ASF as an Apache Incubator project. We acquired XA Secure in 2014 and turned its commercial software for comprehensive security into open source as Apache Ranger.

Partner and Deliver: Establish joint engineering relationships to accelerate Enterprise Hadoop innovation. Our deep joint engineering work with Microsoft, HP, SAS, Pivotal, Red Hat, Teradata, and others illustrates this approach; Microsoft's recent launch of its Azure HDInsight service on Linux is one result of such collaboration.

The enterprise-focused initiatives are an important element of our approach. The Stinger Initiative has successfully rallied contributions from hundreds of developers across dozens of companies to address SQL-in-Hadoop needs for the enterprise. The Data Governance Initiative was recently formed with Aetna, Merck, Target, and SAS to address the data stewardship, lineage, lifecycle management, and privacy issues that are increasingly important.

Open Data Platform Initiative (ODP)

The Open Data Platform initiative (ODP), announced this week, aims to rally enterprise end users and vendors alike around a well-defined common core platform (the ODP Core) against which big data solutions can be qualified.

How does the ODP relate to the ASF?

It’s simple. All upstream production happens within the ASF projects according to the ASF governance model. Individuals working for ODP member companies are encouraged to participate and contribute to ASF projects as they see fit and in accordance with ASF processes. Since Hortonworks engineers do all of their coding in these ASF projects, we’re more than happy to help newcomers learn the Apache way and contribute.

The ODP, on the other hand, is focused on enabling downstream consumption of a common set of Hadoop-related components, and more importantly, specific versions of those components. Harmonizing the broader market around Apache Hadoop version 2.6 and Apache Ambari 2.0, for example, will help simplify the onboarding of manageable, YARN-based solutions that can ride atop the common core platform.

Increasing the compatibility among Hadoop-based platforms and solutions will free up the broader big data ecosystem to focus on more important things, such as data-driven applications that deliver proactive insights for the business. Innovation will advance even faster in the market with all the ODP members building upon the same downstream kernel: Apache Hadoop, YARN, and Apache Ambari.

Modern platform standards are defined by open communities

At Hortonworks, our founding belief is that innovation and adoption of platform technologies like Hadoop are best accomplished through collaborative open source development under the governance model of an entity like the Apache Software Foundation (ASF).

In order to enable a data platform like Hadoop to be easy to use and enterprise-grade, you don’t go it alone. You do it by working with your customers and the broader ecosystem to enable:

data architects to deeply integrate existing systems with Hadoop;

developers, data workers, and analysts to build applications quickly and easily; and

operators and security administrators to deploy, manage, secure, and govern the platform and the applications deployed on it in a consistent way.

Our approach to the market is about enabling our customers to embrace Hadoop in a way that makes sense for their business. It’s about enabling our partners so that the alliance drives joint value while each side remains respectful of the other in the process. And it’s about rallying a community in a way that drives innovation around shared goals.

Done right, open source promotes an equitable balance of power; done together, it offers a fair exchange of value between vendor and consumer.
