Beyond Hadoop-as-a-Service: The Opportunity for Big-Data-as-a-Service

I’ve written in the past about the opportunity for Hadoop-as-a-Service (HaaS) – providing self-service provisioning, elastic scaling, and support for multi-tenancy. But in my discussions with customers over the past year, it’s become clear that the opportunity is even bigger than Hadoop. The next big thing in big data is Big-Data-as-a-Service (BDaaS).

There are three key trends driving the evolution and emergence of this new BDaaS opportunity:

Apache Spark and the evolving big data ecosystem. Hadoop recently celebrated its 10th birthday and continues to gain widespread adoption. But in recent years, other new big data frameworks and tools have also gained in popularity. Foremost among these is Apache Spark, the most active open source project in big data. We’re also seeing increased interest in Kafka, Flink, NoSQL technologies such as Cassandra, and much more. And there continues to be rapid innovation in the commercial software market for big data – including analytics, ETL, search, log analytics, and other BI tools. Hadoop is still at the forefront (and many of these tools complement and extend Hadoop), but BDaaS is much more than Hadoop.

Enterprise adoption of containers and microservices. Container and microservices technology (Docker in particular) has taken hold in the enterprise, and the pace of adoption has accelerated over the past year. Like Spark, Docker has become one of the fastest growing open source technologies ever. Application developers have embraced the simplicity and agility of containers, and microservices are a foundation of the DevOps model. Enterprise IT teams have made containers part of their architecture strategy. And the container revolution is now being extended to big data applications.

The cloud experience for big data, with no compromises. Until recently, big data deployments were almost exclusively bare metal on-premises. But now data scientists, analysts, and developers in the line of business want the cloud experience; they want self-service, on-demand clusters, elasticity, and DevOps agility with all their big data tools. There are several public cloud services for Hadoop and Spark, but there are important factors that prevent many big data workloads from moving to the public cloud – including performance, security, compliance, and data gravity. Data gravity means that data that already resides on-prem is likely to stay on-prem due to the cost, risk, and challenges of moving very large volumes of data. Using container technology and next generation big data infrastructure, customers can have the BDaaS cloud experience and the enterprise-grade performance, security, compliance, and high availability required for big data workloads on-premises.

To learn more about Big-Data-as-a-Service, register for our upcoming joint webinar on September 15th with BlueData, a software company that provides an innovative platform for BDaaS using Docker containers, and a Cisco Solution Partner: http://bit.ly/2bH3EJA

Excellent article.
These trends lay out Cisco's opportunity very well.
Container technology and Microservice architecture are complementary.
A good example would be to describe how we decomposed a "monolithic" application into a set of individual microservices and containerized them with Docker.
Thanks,
Burt
