Running at Google Scale With the Zeta Architecture

Google has set the standard for most of the world when it comes to running systems at scale. It has created a number of different technologies to benefit its business. It built those technologies in a way that makes sense for its business, but it has also written many white papers to share these technologies with the rest of the world. This is, after all, how the entire Hadoop ecosystem came to fruition. Google white papers have also inspired many other great open source projects such as Apache Drill, Apache Mesos, and Apache Spark.

All of these white papers have a common theme: they solve problems that Google has faced. What has been missing until now, however, is a way of bringing these technologies together so that any data-centric organization can benefit from the capabilities of each technology across its entire data center, and in new ways not documented by any single white paper. The approach that fills this gap is called the “Zeta Architecture.”

Enterprise architecture should focus on a holistic approach to deliver effectiveness, efficiency, agility, and durability. These points should underlie nearly every enterprise application, but all too often those applications lack holistic enterprise architecture. This may lead to missed opportunities, complex business processes, and even business continuity issues. The Zeta Architecture is an enterprise architecture built with big data and real time in mind.

The Zeta Architecture lays out the foundational premise for a data-centric enterprise and comprises seven components:

Distributed File System

Solution Architecture

Real-Time Data Storage

Enterprise Applications

Pluggable Compute Model/Execution Engine

Dynamic and Global Resource Management

Deployment/Container Management System

Google has never formally documented its enterprise architecture for public consumption, but when looking at the seven components of the Zeta Architecture, it becomes quite clear that this is its foundational approach. The following are technologies that Google has created, uses, or contributes to, listed in the same order as the components of the Zeta Architecture:

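Distributed File System: GoogleFS, Colossus

Solution Architecture: Recommenders, Machine Learning

Real-Time Data Storage: Spanner, Megastore, BigTable, F1

Enterprise Applications: HTTP Servers, Gmail

Pluggable Compute Model/Execution Engine: BigQuery, Dataflow, Dremel, MillWheel

Dynamic and Global Resource Management: Borg, Omega

Deployment/Container Management System: cgroups, Kubernetes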
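
One way to keep this pairing straight is to write it down as a lookup table. The following is a minimal sketch in Python, assuming each technology belongs to the component whose role the article describes for it. The names `ZETA_TO_GOOGLE` and `component_for` are made up for illustration and are not part of any Google or Zeta Architecture tooling.

```python
# Illustrative lookup table: Zeta Architecture component -> Google technologies
# named in this article. All identifiers here are hypothetical.
ZETA_TO_GOOGLE = {
    "Distributed File System": ["GoogleFS", "Colossus"],
    "Real-Time Data Storage": ["Spanner", "Megastore", "BigTable", "F1"],
    "Pluggable Compute Model/Execution Engine": [
        "BigQuery", "Dataflow", "Dremel", "MillWheel",
    ],
    "Deployment/Container Management System": ["cgroups", "Kubernetes"],
    "Solution Architecture": ["Recommenders", "Machine Learning"],
    "Enterprise Applications": ["HTTP Servers", "Gmail"],
    "Dynamic and Global Resource Management": ["Borg", "Omega"],
}


def component_for(technology: str) -> str:
    """Return the Zeta component that a given technology falls under."""
    for component, technologies in ZETA_TO_GOOGLE.items():
        if technology in technologies:
            return component
    raise KeyError(f"{technology!r} is not listed in the table")


if __name__ == "__main__":
    print(component_for("Spanner"))  # Real-Time Data Storage
    print(component_for("Borg"))     # Dynamic and Global Resource Management
```

An organization adopting the architecture could fill in the same table with its own stack, which makes gaps (a component with no technology behind it) easy to spot.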

Let’s look at this in a couple of different ways. The first is through the eyes of Gmail. Gmail is delivered via a web server, which has recommendation engines running under it to deliver advertisements, as well as other machine learning libraries to handle problems such as spam. Google as a company deploys more than 2 billion containers per week, some of which contain the pieces of software that support Gmail. Google uses Spanner as the real-time store for Gmail and compute engines such as Dremel for analytics. All of these tools store their data in Colossus or GoogleFS. To round out this use case, Google uses Borg and Omega to manage all of its resources globally. When it needs more instances of any of those pieces of software, it can spin them up dynamically.
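
To make the last point concrete, here is a toy model of what "spinning up more instances dynamically" means when every service draws from one shared pool of resources. It is only a sketch under simplifying assumptions (a single pool measured in CPUs, no placement or priorities). `ToyResourceManager` and the service names are hypothetical and do not reflect how Borg or Omega actually work.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a global resource manager (not Borg or Omega)."""

    total_cpus: int
    used_cpus: int = 0
    running: dict = field(default_factory=dict)  # service name -> instance count

    def scale_up(self, service: str, instances: int, cpus_each: int = 1) -> int:
        """Start as many of the requested instances as free capacity allows."""
        free = self.total_cpus - self.used_cpus
        started = min(instances, free // cpus_each)
        self.used_cpus += started * cpus_each
        self.running[service] = self.running.get(service, 0) + started
        return started

    def scale_down(self, service: str, instances: int, cpus_each: int = 1) -> int:
        """Stop instances and return their CPUs to the shared pool."""
        stopped = min(instances, self.running.get(service, 0))
        self.used_cpus -= stopped * cpus_each
        self.running[service] = self.running.get(service, 0) - stopped
        return stopped


if __name__ == "__main__":
    manager = ToyResourceManager(total_cpus=10)
    print(manager.scale_up("web-frontend", instances=4, cpus_each=2))  # 4
    print(manager.scale_up("spam-filter", instances=3))                # 2
```

The second call starts only two of the three requested instances because the pool is nearly full. The design point is that one manager sees all demand across the data center, so capacity freed by one service is immediately available to another.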

The second use case is Google BigQuery, a service for querying big data that scales dynamically. The same technologies mentioned above form the foundation that supports BigQuery. Depending on the size of the dataset, Google automatically gives BigQuery more horsepower via the scheduling system Omega. Query performance can be optimized based on the size of the data and the number of computers needed to process that data in a reasonable time. This same query functionality exists in Apache Drill. The piece that Drill doesn’t currently handle is auto-scaling. In the Zeta Architecture, it is realistic that a Mesos framework could support Drill in this endeavor. Just imagine running queries across massive quantities of data that always complete within a service-level window.
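
The sizing decision behind that kind of auto-scaling can be approximated with simple arithmetic. The sketch below assumes perfectly parallel scanning with no coordination overhead, which real query engines will not achieve. The function name, parameters, and throughput figures are invented for illustration and are not taken from BigQuery, Drill, or Mesos.

```python
import math


def nodes_needed(dataset_gb: float, gb_per_node_per_sec: float, sla_seconds: float) -> int:
    """Smallest node count that can scan the dataset within the SLA window.

    Assumes every node scans at the same rate and work divides evenly.
    """
    if dataset_gb < 0 or gb_per_node_per_sec <= 0 or sla_seconds <= 0:
        raise ValueError("sizes must be non-negative; rate and window must be positive")
    # How much data a single node can get through before the window closes.
    per_node_capacity_gb = gb_per_node_per_sec * sla_seconds
    return max(1, math.ceil(dataset_gb / per_node_capacity_gb))


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 GB/s per node and a 10-second service-level window.
    print(nodes_needed(1_000, 0.5, 10))   # 200
    print(nodes_needed(10_000, 0.5, 10))  # 2000
```

A scheduler-side framework would run a calculation along these lines per query, request that many workers from the resource manager, and release them when the query finishes.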

This architecture is the real secret to running at Google scale. With the proper technologies, not only can you dynamically scale your data-centric applications to handle anything in real time, but business processes and your overall system design can be simplified, dramatically reducing costs. Complex operational processes and procedures for things like security, disaster recovery, deployment management, and even contingency planning are pulled together in a holistic and seamless way.


James A. Scott (prefers to go by Jim) is Director, Enterprise Strategy & Architecture at MapR Technologies and is very active in the Hadoop community. Jim helped build the Hadoop community in Chicago as cofounder of the Chicago Hadoop Users Group. He has implemented Hadoop at three different companies, supporting a variety of enterprise use cases from managing Points of Interest for mapping applications, to Online Transactional Processing in advertising, as well as full data center monitoring and general data processing. Jim also was the SVP of Information Technology and Operations at SPINS, the leading provider of retail consumer insights, analytics reporting and consulting services for the Natural and Organic Products industry. Additionally, Jim served as Lead Engineer/Architect for Conversant (formerly Dotomi), one of the world's largest and most diversified digital marketing companies, and also held software architect positions at several companies including Aircell, NAVTEQ, and Dow Chemical. Jim speaks at many industry events around the world on big data technologies and enterprise architecture. When he's not solving business problems with technology, Jim enjoys cooking, watching-and-quoting movies and spending time with his wife and kids. Jim is on Twitter as @kingmesal.