Hybrid and Multi-Cloud Patterns and Practices

This article is the first part of a multi-part series that discusses hybrid and
multi-cloud deployments, architecture patterns, and network topologies. This
part explores the opportunities and challenges of hybrid and multi-cloud
deployments, and provides guidance on how to approach and implement a hybrid
setup that uses Google Cloud Platform (GCP).

Digitalization and the need to adapt rapidly to changing market demands have
caused a rise in the requirements and expectations that are placed on
enterprise IT. Many companies find it challenging to accommodate and adapt to
these trends by using existing infrastructure and processes.

At the same time, IT departments often find themselves under scrutiny and
pressure to improve cost effectiveness, making it difficult to justify
additional capital expenditure (capex) investments to extend and modernize data
centers and equipment.

A hybrid cloud strategy provides a pragmatic solution. By using
the public cloud, you can extend the capacity and capabilities of your IT
without up-front capex investments. By adding one or more
cloud deployments to your existing infrastructure, you not only
preserve your existing investments, but also avoid committing yourself
to a single IT vendor. Additionally, by using a hybrid strategy, you can
modernize applications and processes incrementally as resources permit.

Hybrid cloud and multi-cloud

Because workloads, infrastructure, and processes are unique to each enterprise,
each hybrid strategy must be adapted to specific needs. The result is
that the terms hybrid cloud and multi-cloud are sometimes used
inconsistently.

Within the context of GCP, the term hybrid cloud describes a
setup in which common or interconnected workloads are deployed across multiple
computing environments, one based in the public cloud, and at least one being
private.

The most common example is combining a private computing environment, usually an
existing, on-premises data center, and public cloud computing environment, as
the following diagram shows.

The term multi-cloud describes setups that combine at least two public cloud
providers, as in the following diagram.

A multi-cloud setup might also include private computing environments.

Drivers for hybrid cloud and multi-cloud setups

Hybrid and multi-cloud setups might be temporary, maintained only for a limited
time to facilitate a migration. However, these setups might also represent the
future state of most organizations as they build new systems and evolve existing
ones to get the best from each environment, no matter where a workload runs. Hybrid and
multi-cloud setups might therefore be permanent fixtures in the IT landscape.

A hybrid or multi-cloud setup is rarely a goal in itself, but rather a means of
meeting business requirements. Choosing the right hybrid or multi-cloud setup
therefore requires first clarifying these requirements.

Business drivers and constraints

Common drivers and constraints from the business side include the following:

Architecture constraints

On the architecture side, the biggest constraints often stem from existing
systems and can include the following:

Dependencies between applications.

Performance and latency requirements for communication between systems.

Reliance on hardware or operating systems that might not be available in
the public cloud.

Licensing restrictions.

Overall goals

The goal of a hybrid and multi-cloud strategy is to meet these requirements with
a plan that describes the following:

Which workloads should be run in or migrated to each computing
environment.

Which patterns to apply across multiple workloads.

Which technology and network topology to use.

Fundamentally, any hybrid and multi-cloud strategy is derived from the business
requirements. How you derive a usable strategy from the business requirements is
rarely clear, however. The workloads, architecture patterns, and technologies
you choose not only depend on the business requirements, but also influence each
other in a cyclic fashion. The following diagram illustrates this cycle.

Defining a vision

Within this web of dependencies and constraints, defining a plan that considers
all workloads and requirements is difficult at best, especially in a complex IT
environment. In addition, planning takes time and might lead to competing
stakeholder interests.

To avoid this situation, first develop a vision statement that focuses on the
business perspective and addresses the following questions:

Why is the current approach and computing environment insufficient?

What are the primary metrics that you want to optimize for by using
the public cloud?

How long do you plan to use a hybrid or multi-cloud setup? Do you
consider this setup permanent, or interim for the length of a full cloud
migration?

The vision statement does not address how to achieve these goals.

Agreeing on a vision and obtaining relevant stakeholder sign-off provide a
foundation for the next steps in the planning process.

Designing a hybrid and multi-cloud strategy

After you have settled on a vision, you can elaborate the strategy:

Conduct an initial workload assessment. Considering the goals outlined
in the vision document, identify a candidate list of planned and existing
workloads that could benefit from being deployed or migrated to the public
cloud. The following section discusses this topic in more detail.

Starting with the identified candidate workloads, identify applicable
patterns and, based on those patterns, candidate topologies.

If you identify more than one applicable pattern and topology, refine your
workload selection so that you can settle on a single pattern and topology.
Iterate as necessary to refine your selections.

Applying multiple patterns and topologies is a viable approach for large
organizations. But this approach is rarely ideal because of the extra
complexity, which in turn might slow your progress.

Prioritize your workloads. Given the many requirements, it's best
to take an iterative approach.

Select an initial workload to put in the public cloud. Make sure that this
workload is neither business critical nor too difficult to migrate, yet typical
enough to serve as a blueprint for upcoming deployments or migrations.

While selecting a workload to migrate, start preparing on the
GCP side.

Implement the network topology and establish the necessary connectivity
between GCP and your private computing environments.

Workloads

The decision about which workloads to run on which computing
environments has a profound impact on the effectiveness of a hybrid and
multi-cloud strategy. Putting the wrong workload on the cloud can complicate
your deployment while providing little benefit. Putting an appropriate workload
in the right place not only helps the workload, but helps you learn about the
benefits of each environment.

Cloud first

A common way to begin using the public cloud is cloud first. In this approach,
you deploy new workloads to the public cloud. In that case, consider a classic
deployment to a private computing environment only if a cloud deployment is not
possible for technical or organizational reasons.

The cloud-first strategy has advantages and disadvantages. On the positive side,
it's forward looking. You can deploy new workloads in a clean and cloud-native
fashion while avoiding (or at least minimizing) the hassles of migrating
existing workloads.

On the downside, using a cloud-first strategy might cause you to miss
opportunities for your existing workloads. New workloads might constitute only a
fraction of your overall IT workload, and their impact on overall IT spending
and performance might be limited. The time you spend migrating an existing
workload might yield bigger advantages or savings than trying to accommodate a
new workload in the cloud.

Following a strict cloud-first strategy also risks increasing the
overall complexity of your IT environment. This approach might create
redundancies, lower performance due to excessive cross-environment
communication, or result in a computing environment that is not well suited for
the individual workload.

Considering these risks, you might be better off using a cloud-first approach
only for selected workloads. That way you can concentrate on workloads that
can benefit the most from a cloud deployment or migration. This approach also
takes into account the modernization of existing workloads, which is discussed
in the next section.

Migration and modernization

Hybrid/multi-cloud and IT modernization are distinct concepts that are linked in
a virtuous circle. Using the public cloud can facilitate and simplify the
modernization of IT workloads, and modernizing your IT will help you get more
from the cloud.

The primary goals of modernizing workloads are as follows:

Achieving greater agility so that you can adapt to changing
requirements.

Reducing costs of infrastructure and operations.

Increasing reliability and resiliency in order to minimize risk for the
business.

Lift and shift

Lift and shift describes the process of migrating a workload from a private
computing environment to the public cloud without changing the workload in any
significant manner. Most commonly, this process involves migrating existing
virtual machines (VMs) and their images to Compute Engine.

Running VMs in Compute Engine rather than in a private computing environment
has these benefits:

You can provision computing and storage resources quickly, avoiding
delays that are caused by procuring and installing equipment in classic
(private or on-premises) data centers.

You pay only for the compute resources that you use, with no up-front
commitment or investment.

You can automate operational tasks and reduce effort and costs as a result.

If you also then rewrite applications to become more cloud native, you can
unlock significant additional benefits:

By using autoscaling, you can ensure that computing resources are
provisioned only when they are needed, avoiding any overprovisioning costs.

You can take advantage of cluster managers such as Kubernetes to increase
the resiliency of your applications by automatically restarting them or
migrating them to different machines in case of failure.

You can further reduce the operational overhead by using managed services.

You can automate deployment, which helps accelerate product development and
release processes, which in turn can help your business react more quickly
to feedback, changing requirements, and market demands.
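The autoscaling benefit above can be made concrete with some simple arithmetic. The following Python sketch compares the instance-hours consumed by a fleet that is sized for peak demand all day with a fleet that is resized every hour. The demand figures, the per-instance capacity, and the function names are illustrative assumptions, not measurements or part of any GCP API.

```python
from math import ceil


def instance_hours_static(hourly_demand, capacity_per_instance):
    """Instance-hours when the fleet is sized for peak demand at all times."""
    peak = max(hourly_demand)
    instances = ceil(peak / capacity_per_instance)
    return instances * len(hourly_demand)


def instance_hours_autoscaled(hourly_demand, capacity_per_instance, min_instances=1):
    """Instance-hours when the fleet is resized every hour to match demand."""
    return sum(
        max(min_instances, ceil(demand / capacity_per_instance))
        for demand in hourly_demand
    )


# Hypothetical load in requests per second for each hour of one day:
# a quiet night, a busy working day, and a moderate evening.
demand = [50] * 8 + [400] * 8 + [150] * 8

static = instance_hours_static(demand, capacity_per_instance=100)
scaled = instance_hours_autoscaled(demand, capacity_per_instance=100)
print(static, scaled)  # prints "96 56"
```

Under these assumed numbers, the autoscaled fleet uses 56 instance-hours instead of 96. Real savings depend on how spiky the load is and on how quickly instances can start.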

As this diagram shows, when you are modernizing an existing workload, consider
shifting the application to the cloud and transforming the application to
become cloud native.

Transform and move

Although it is common to shift an application to the cloud before investing in
transformation, the reverse approach might be better for some applications. The
idea of transform and move is to begin a migration by refactoring and
modernizing an application that is already in place. Even before you move the
application to the cloud, this transformation has a number of benefits:

You can improve the deployment process.

Investing in continuous integration/continuous deployment (CI/CD)
infrastructure and tooling can speed up the release cadence and shorten
feedback cycles.

After the transformation, you move the application to the cloud, which helps you
to provision resources quickly and increase cost efficiency by using autoscaling
and therefore not overprovisioning.

For transform and move to work well, consider making certain investments in
on-premises infrastructure and tooling, such as setting up a local Docker
registry and provisioning Kubernetes clusters to containerize applications.

Rip and replace

Rip and replace refers to removing a system and replacing it. In some cases,
trying to evolve an existing system and code base might not be cost effective or
even possible. Requirements might have changed substantially, or the existing
application might be based on a software or hardware stack that is not fit for
future investments. In such cases, a better approach might be to replace the
system, which might mean either purchasing a new solution or developing a modern
and cloud-native application from scratch.

Mixing and matching migration approaches

Each of the three migration approaches has certain strengths and weaknesses. A
key advantage of following a hybrid and multi-cloud strategy is that it is not
necessary to settle on a single approach. Instead, you can decide which approach
works best for each workload.

Choose lift and shift if any of the following is true of the workloads:

They have a relatively small number of dependencies on their environment.

They are not considered worth refactoring.

They are based on third-party software.

Consider transform and move for these types of workloads:

They have dependencies that must be untangled.

They rely on operating systems, hardware, or database systems that cannot
be accommodated in the cloud.

They are not making efficient use of compute or storage resources.

They cannot easily be deployed in an automated fashion.

Finally, rip and replace might be best for these types of workloads:

They no longer satisfy current requirements.

They are based on third-party technology that has reached its end of life.

They require third-party license fees that are no longer economical.
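The criteria above can be captured as a first-pass triage function. The following Python sketch is one possible encoding: the field names are hypothetical, and the precedence (rip and replace first, then transform and move, with lift and shift as the fallback) is an assumption that the lists above do not prescribe.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    LIFT_AND_SHIFT = "lift and shift"
    TRANSFORM_AND_MOVE = "transform and move"
    RIP_AND_REPLACE = "rip and replace"


@dataclass
class Workload:
    """Hypothetical yes/no answers collected for one workload."""
    name: str
    meets_requirements: bool = True
    vendor_end_of_life: bool = False
    license_uneconomical: bool = False
    tangled_dependencies: bool = False
    needs_unsupported_platform: bool = False
    wastes_resources: bool = False
    hard_to_automate: bool = False


def suggest_approach(w: Workload) -> Approach:
    """Return a starting suggestion; the ordering of checks is an assumption."""
    # Workloads that are no longer worth evolving are replaced outright.
    if not w.meets_requirements or w.vendor_end_of_life or w.license_uneconomical:
        return Approach.RIP_AND_REPLACE
    # Workloads that need refactoring first are transformed in place, then moved.
    if (w.tangled_dependencies or w.needs_unsupported_platform
            or w.wastes_resources or w.hard_to_automate):
        return Approach.TRANSFORM_AND_MOVE
    # Everything else can be moved largely unchanged.
    return Approach.LIFT_AND_SHIFT


print(suggest_approach(Workload("intranet-wiki")).value)
print(suggest_approach(Workload("order-system", tangled_dependencies=True)).value)
```

A function like this only produces a starting point for discussion. The final choice for each workload still depends on the business requirements and on the effort that you can invest.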

Portability

In most migrations, shifting a workload to the cloud is a one-time, irreversible
effort. But in hybrid and especially in multi-cloud scenarios,
you might want to be able to shift workloads between clouds later. To
facilitate this ability, make sure that your workloads are portable:

Make sure you can shift a workload from one computing environment to another
without significant modification.

Make sure that application deployment and management are consistent
across computing environments.

Make sure that keeping the workload portable does not conflict with the
workload being cloud native.

At the infrastructure level, you can use tools such as Terraform to automate
and unify creation of infrastructure resources such as VMs and load balancers
in heterogeneous environments. Additionally, you can use configuration
management tools such as Ansible, Puppet, or Chef to establish a common
deployment and configuration process. Alternatively, you can use an image-baking
tool like Packer to create VM images for different platforms by using a single,
shared configuration file. Finally, you can use solutions such as Prometheus and
Grafana to help ensure consistent monitoring across environments.

Based on these tools, you can assemble a tool chain similar to the one in the
following diagram. This tool chain abstracts away the differences between
computing environments, and it enables you to unify provisioning, deployment,
management, and monitoring.
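The idea behind such a tool chain is that the same rollout steps run against every environment through one shared interface. The following Python sketch illustrates that idea only: ComputeEnvironment, InMemoryEnvironment, and roll_out are hypothetical names, and the in-memory class records calls instead of talking to Terraform, Packer, or any real platform.

```python
from typing import Protocol


class ComputeEnvironment(Protocol):
    """Hypothetical interface that each environment adapter would implement."""
    name: str

    def provision_vm(self, vm_name: str) -> str: ...

    def deploy(self, vm_id: str, artifact: str) -> None: ...


class InMemoryEnvironment:
    """Stand-in for a real environment; it records calls instead of making them."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.deployed: dict[str, str] = {}

    def provision_vm(self, vm_name: str) -> str:
        vm_id = f"{self.name}/{vm_name}"
        self.deployed[vm_id] = ""
        return vm_id

    def deploy(self, vm_id: str, artifact: str) -> None:
        self.deployed[vm_id] = artifact


def roll_out(environments: list[ComputeEnvironment], vm_name: str, artifact: str) -> list[str]:
    """Run the same provisioning and deployment steps in every environment."""
    vm_ids = []
    for env in environments:
        vm_id = env.provision_vm(vm_name)
        env.deploy(vm_id, artifact)
        vm_ids.append(vm_id)
    return vm_ids


on_prem = InMemoryEnvironment("on-prem")
cloud = InMemoryEnvironment("cloud")
ids = roll_out([on_prem, cloud], "web-1", "app-v2.tar")
print(ids)  # prints "['on-prem/web-1', 'cloud/web-1']"
```

In a real tool chain, each adapter would wrap the tooling for one environment. The rollout logic stays identical, which is the unification that the tool chain is meant to provide.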

Although a common tool chain can help you achieve portability, it is subject to
several shortcomings:

You might not be able to make use of certain features that a cloud
environment offers natively. Specifically, using VMs as a common foundation
makes it difficult to implement truly cloud-native applications. Sometimes,
using VMs prevents you from using managed services, so you might miss
opportunities to reduce administrative overhead.

Building up and maintaining the tool chain incurs overhead and operational
costs.

Over time, the tool chain might grow to become complex in ways that are
unique to your company. This complexity can lead to increased training
costs.

Containers and Kubernetes

Building and maintaining a custom tool chain to achieve workload portability by
using VMs involves many challenges. One solution is to leverage containers and
Kubernetes instead.

Containers help your software to run reliably when you move it from one
environment to another. Kubernetes handles the orchestration, deployment,
scaling, and management of your containerized applications, providing the
services that form the foundation of a cloud-native application. Because you can
install and run Kubernetes on a variety of computing environments, you can also
use it to establish a common runtime layer across computing environments:

Kubernetes provides the same services and APIs in a cloud or private
computing environment. Moreover, the level of abstraction is much higher
than when working with VMs, which generally translates into less required
groundwork and improved developer productivity.

Unlike a custom tool chain, Kubernetes is widely adopted for both
development and application management, so you can tap into existing
expertise, documentation, and third-party support.

Kubernetes uses Docker containers, an industry-adopted standard for
application packaging that is not tied to any specific vendor. Kubernetes
itself is open source and governed by the Cloud Native Computing Foundation.

You can avoid the effort of installing and operating Kubernetes by using a
managed Kubernetes platform such as Google Kubernetes Engine (GKE), so
operations staff can shift their focus from infrastructure to applications.
The following diagram shows what a managed Kubernetes platform might look like.

Limits to workload portability

To help make your workloads more portable, Kubernetes provides a layer of
abstraction that can hide many of the intricacies of and differences between
computing environments. That abstraction has some limitations, however:

An application might be portable to a different environment with
minimal changes, but that doesn't mean that the application will perform
equally well in both environments. Differences in underlying compute or
networking infrastructure along with proximity to dependent services might
lead to substantially different performance.

Moving a workload between computing environments might also require you
to move data. In addition to the time, effort, and budget that are needed to
copy or move data between computing environments, those environments often
differ in the services and facilities that they provide to store and manage
such data.

Kubernetes offers a unified way to provision different kinds of load
balancers. The behavior of these load balancers is not defined in detail,
however, and might differ between environments in subtle ways.

Even with Kubernetes, it can be challenging to abstract away differences between
computing environments or public clouds. Workload portability aims primarily to
simplify migrations between environments, not to automate them.

Workload assessment

When you have new projects that are in progress and hundreds or even thousands
of workloads that are already running, it can be daunting to decide which
workloads to deploy or migrate to which computing environment.

To help you make such decisions consistently and objectively, consider
categorizing and scoring workloads by opportunity, risk, and technical
difficulty.

These factors can help you evaluate migration opportunities:

Potential for market differentiation or innovation that is enabled by
using cloud services

These factors can help you evaluate the risk of a migration:

Potential impact of outages that are caused by a migration or by the
fact that your experience with public cloud deployments might be limited at
first

The need to comply with any existing legal or regulatory restrictions

These factors can help you evaluate the technical difficulties of a migration:

Size, complexity, and age of the application

Number of dependencies on other applications

Any restrictions that third-party licenses impose

Dependencies on specific versions of operating systems, databases, or
other environment configurations
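One way to apply scoring by opportunity, risk, and technical difficulty is to rate each workload on a small numeric scale and rank the results. The following Python sketch assumes a 1-to-5 scale, equal default weights, and made-up workload names and ratings. The scoring formula is an illustration, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Ratings for one workload, each from 1 (low) to 5 (high)."""
    name: str
    opportunity: int
    risk: int
    difficulty: int


def priority(a, weights=(1.0, 1.0, 1.0)):
    """Higher is better: opportunity counts for, risk and difficulty against."""
    w_opp, w_risk, w_diff = weights
    return w_opp * a.opportunity - w_risk * a.risk - w_diff * a.difficulty


def rank(assessments):
    """Return the assessments ordered from most to least attractive."""
    return sorted(assessments, key=priority, reverse=True)


# Hypothetical candidate workloads and ratings.
candidates = [
    Assessment("billing-batch", opportunity=2, risk=4, difficulty=4),
    Assessment("marketing-site", opportunity=4, risk=1, difficulty=2),
    Assessment("analytics-pipeline", opportunity=5, risk=2, difficulty=3),
]

for a in rank(candidates):
    print(a.name, priority(a))
```

With these assumed ratings, marketing-site ranks first, followed by analytics-pipeline and billing-batch. Adjust the weights if, for example, regulatory risk should count more heavily than technical difficulty.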

After you have assessed the initial workloads, you can begin to prioritize
workloads and identify applicable architecture patterns and network topologies.
This step might require multiple iterations. Because your assessment might
change over time, it is also worth reevaluating workloads after you do your
first cloud deployments.

What's next

Learn about common architecture patterns for hybrid and multi-cloud, which
scenarios they are best suited for, and how to apply them.

Find out more about network topologies for hybrid and multi-cloud, and how to
implement them.