Google Cloud provides global and regional health check systems that
connect to backends on a configurable, periodic basis. Each connection attempt
is called a probe, and each health check system is called a prober.
Google Cloud records the success or failure of each probe.

Health checks and load balancers work together. Based on a configurable
number of sequential successful or failed probes, Google Cloud
computes an overall health state for each backend in the load balancer. Backends
that respond successfully for the configured number of times are considered
healthy. Backends that fail to respond successfully for a separate number of
times are unhealthy.

Google Cloud uses the overall health state of each backend to determine
its eligibility for receiving new requests or connections. In addition to being
able to configure probe frequency and health state thresholds, you can configure
the criteria that define a successful probe. This is discussed in detail
in the section How health checks work.

Health check categories, protocols, and ports

Google Cloud organizes health checks by category and protocol.

The two categories are health checks and legacy health
checks. Each category supports a different set of protocols and a means for
specifying the port used for health checking. The protocol and port determine
how Google Cloud health check systems contact your backends. For example,
you can create a health check that uses the HTTP protocol on TCP port 80, or you
can create a health check that uses the TCP protocol for a named port
configured on an instance group.

Most Google Cloud load balancers require non-legacy health checks, but
Network Load Balancing requires legacy health checks that use the HTTP
protocol. You can find specific guidance about selecting the category,
protocol, and ports in the next section, "Selecting a health check."

You cannot convert a legacy health check to a health check, and you cannot
convert a health check to a legacy health check.

Note: The term health check does not refer to legacy health checks. In this
document, legacy health checks are explicitly called legacy health checks.

Selecting a health check

Health checks must be compatible with the type of load balancer and the types
of backends (instance groups or zonal NEGs) that it uses. The three
factors that you must specify when you create a health check are as follows:

Category: health check or legacy health check, which must be compatible
with the load balancer.

Protocol: defines how the Google Cloud health check systems contact your
backends.

Port specification: defines which ports are used for the health check's
protocol.

The guide at the end of this section summarizes valid combinations
of health check category, protocol, and port specification based on a given
type of load balancer and backend type.

For information about the types of health checks supported by various load
balancers, see Health checks.

Note: As used in this section, the term instance group refers to unmanaged
instance groups, managed zonal instance groups, or managed regional instance
groups.

Category and protocol

The type of load balancer and the types of backends that the load balancer uses
determine the health check's category. Network Load Balancing requires
legacy health checks that use the HTTP protocol. All other load balancer
types use regular health checks.

You must select a protocol from the list of protocols supported by the health
check's category. It's a best practice to use the same protocol as the load
balancer; however, this is not a requirement, nor is it always possible. For
example, network load balancers require legacy health checks that use the
HTTP protocol, even though Network Load Balancing supports TCP and UDP in
general. For network load balancers, you must run an HTTP server on your
virtual machine (VM) instances so that they can respond to health check
probes.
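For example, a minimal HTTP responder like the following sketch could run on each backend VM to answer probes. The handler, helper name, and default port are illustrative; the port must match the one your health check is configured to probe.

```python
# A minimal HTTP responder for health check probes (sketch only).
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health check probes require an HTTP 200 (OK) response code.
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep frequent probe traffic out of the logs.
        pass

def make_health_server(host="0.0.0.0", port=80):
    # Returns the server; call .serve_forever() to start answering probes.
    # Port 80 is an assumption; use the port your health check probes.
    return HTTPServer((host, port), HealthHandler)
```

In practice, most applications answer probes from their existing serving process; a separate responder like this is only needed when the backend does not otherwise speak HTTP.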

The following list shows the health check categories and the protocols that
each category supports.

Health check:
• HTTP
• HTTPS
• HTTP/2 (with TLS)
• SSL
• TCP

Legacy health check:
• HTTP
• HTTPS

Legacy HTTPS health checks are not supported for network load balancers and
cannot be used with most other types of load balancers.

Category and port specification

In addition to a protocol, you must select a port specification for your health
check. Health checks provide three port specification methods, and legacy health
checks provide one method. Not all port specification methods are applicable to
each type of load balancer. The type of load balancer and the types of backends
that the load balancer uses determine which port specification method you can
use.

The port specification methods and their meanings for each health check
category are as follows.

Health check:
• --port: specify a TCP port number
• --port-name: specify a named port
• --use-serving-port:
  – for instance groups, use the named port that the backend service
    subscribes to
  – for zonal NEGs, use the port defined on each endpoint

Legacy health check:
• --port: specify a TCP port number

Load balancer guide

This table shows the supported category, scope, and port specification for each
Google Cloud load balancer and backend type.

1 You cannot use the
--use-serving-port flag because internal backend services do not have an
associated named port.

2 It is possible, but not
recommended, to use a legacy health check for backend services associated with
external HTTP(S) load balancers under the following circumstances:

The backends are instance groups, not zonal NEGs.

The backend VMs serve traffic that uses either HTTP or HTTPS protocols.

How health checks work

The following sections describe how health checks work.

Probes

When you create a health check or a legacy health check, you specify the
following flags or accept their default values. Each health check or
legacy health check that you create is implemented by multiple
probers. These flags control how frequently each
Google Cloud prober evaluates instances in instance group
backends or endpoints in zonal NEG backends.

A health check's settings cannot be configured on a per-backend basis.
Health checks are associated with an entire backend service, and legacy health
checks are associated with either an entire target pool
(for Network Load Balancing) or backend service (for certain external
HTTP(S) Load Balancing configurations). Thus, the parameters for the probe
are the same for all backends referenced by a given backend service or target
pool.

Configuration flags, purposes, and default values:

check-interval (check interval)

The check interval is the amount of time from the start of one probe
issued by one prober to the start of the next probe issued by the same
prober. Units are seconds. If omitted, Google Cloud uses 5s (5 seconds).

timeout

The timeout is the amount of time that Google Cloud waits for a response
to a probe. Its value must be less than or equal to the check interval.
Units are seconds. If omitted, Google Cloud uses 5s (5 seconds).
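To make the relationship between the two flags concrete, here is a small Python sketch (the class name is hypothetical) that models the defaults and rejects a timeout larger than the check interval:

```python
from dataclasses import dataclass

@dataclass
class HealthCheckConfig:
    # Models the check-interval and timeout flags; values in seconds.
    check_interval: float = 5.0  # default: 5s
    timeout: float = 5.0         # default: 5s

    def __post_init__(self):
        # The timeout must be less than or equal to the check interval.
        if self.timeout > self.check_interval:
            raise ValueError("timeout must be <= check-interval")
```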

Probe IP ranges and firewall rules

For health checks to work, you must create ingress allow firewall rules so
that traffic from Google Cloud probers can connect to your backends.

Importance of firewall rules

Google Cloud requires that you create the necessary ingress allow
firewall rules to permit traffic from probers to your backends. As a best
practice, limit these rules to just the protocols and ports that
match those used by your health checks. For the source IP ranges, make sure to
use the documented probe IP ranges listed in the preceding section.

If you do not have ingress allow firewall rules that permit the protocol,
port, and source IP range used by your health check, the implied deny ingress
firewall rule blocks inbound
traffic from all sources. When probers can't contact your backends, the
Google Cloud load balancer categorizes all of your backends as unhealthy.
The behavior when all backends are unhealthy depends on the type of load
balancer:

An external HTTP(S) load balancer returns HTTP 502 responses to clients when all backends
are unhealthy.

An internal HTTP(S) load balancer returns HTTP 503 responses to clients when all backends
are unhealthy.

SSL proxy load balancers and TCP proxy load balancers time out when all backends are unhealthy.

An internal TCP/UDP load balancer without failover configured distributes
traffic to all backend VMs, as a last resort, when they are all unhealthy.
You can disable this behavior by enabling failover.

Security considerations for probe IP ranges

Consider the following information when planning health checks and the necessary
firewall rules:

The probe IP ranges belong to Google. Google Cloud uses special
routes outside of your VPC
network but within Google's production network to facilitate communication
from probers.

Google uses the probe IP ranges exclusively to execute health check probes and
to send traffic from Google Front Ends (GFEs) for external HTTP(S) load balancers,
SSL proxy load balancers, and TCP proxy load balancers. If a packet is received from the
internet, including the external IP address of a Compute Engine
instance or a Google Kubernetes Engine (GKE) node, and the packet's source IP
address is within a probe IP range, Google drops the packet.

The probe IP ranges are a complete set of possible IP addresses used by
Google Cloud probers. If you use tcpdump or a similar tool, you
might not observe traffic from all IP addresses in all of the probe IP ranges.
As a best practice, create ingress allow firewall rules for your chosen load
balancer by using all of the probe IP ranges as sources because
Google Cloud can implement new probers automatically without
notification.
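As an illustration, a packet filter can test whether a source address falls within the probe ranges by CIDR membership. The two ranges below are the commonly documented health check probe source ranges at the time of writing; they are an assumption here, so confirm them against the current documented list before relying on them.

```python
import ipaddress

# Assumed probe source ranges; verify against the documented list.
PROBE_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]

def is_probe_source(source_ip: str) -> bool:
    # True if source_ip falls inside any of the probe ranges.
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in PROBE_RANGES)
```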

Multiple probes and frequency

Google Cloud sends health check probes from multiple redundant systems
called probers. Probers use specific source IP ranges.
Google Cloud does not rely on just one prober to implement a health
check—multiple probers simultaneously evaluate the instances in instance
group backends or the endpoints in zonal NEG backends. If one prober fails,
Google Cloud continues to track backend health states.

The interval and timeout settings that you configure for a health
check are applied to each prober. For a given backend, software access logs and
tcpdump show more frequent probes than your configured settings.

This is expected behavior, and you cannot configure the number of probers
that Google Cloud uses for health checks. However, you can estimate the
effect of multiple simultaneous probes by considering the following factors.

To estimate the probe frequency per backend service, consider the following:

Base frequency per backend service. Each health check has an
associated check frequency, inversely proportional to the configured
check interval:

1⁄(check interval)

When you associate a health check with a backend service, you establish
a base frequency used by each prober for backends on that backend
service.

Probe scale factor. The backend service's base frequency is multiplied
by the number of simultaneous probers that Google Cloud uses. This
number can vary, but is generally between 5 and 10.

Multiple forwarding rules for internal TCP/UDP load balancers. If you
have configured multiple internal forwarding rules (each having a different IP
address) pointing to the same regional internal backend service,
Google Cloud uses multiple probers to check each IP address. The probe
frequency per backend service is multiplied by the number of configured
forwarding rules.

Multiple forwarding rules for network load balancers. If you have
configured multiple forwarding rules that point to the same target pool,
Google Cloud uses multiple probers to check each IP address. The
probe frequency as seen by each instance in the target pool is multiplied by
the number of configured forwarding rules.

Multiple target proxies for external HTTP(S) load balancers.
If you have configured multiple target proxies that direct traffic to the same
URL map for external HTTP(S) Load Balancing, Google Cloud uses
multiple probers to check the IP address associated with each target proxy.
The probe frequency per backend service is multiplied by the number of
configured target proxies.

Multiple target proxies for SSL proxy load balancers and TCP proxy load balancers.
If you have configured multiple target proxies that direct traffic to the same
backend service for SSL Proxy Load Balancing or TCP Proxy Load Balancing,
Google Cloud uses multiple probers to check the IP address associated
with each target proxy. The probe frequency per backend service is multiplied
by the number of configured target proxies.

Sum over backend services. If a backend (such as an instance group) is
used by multiple backend services, the backend instances are contacted as
frequently as the sum of frequencies for each backend service's health check.

With zonal NEG backends, it's more difficult to determine the exact number of
health check probes. For example, the same endpoint can be in multiple zonal
NEGs, where those zonal NEGs don't necessarily have the same set of endpoints,
and different endpoints can point to the same backend.
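The estimation factors above reduce to simple arithmetic, sketched below. The function names are illustrative, and the prober count is an assumption in the 5-to-10 range described earlier.

```python
def estimated_probe_rate(check_interval_s, num_probers, multiplier=1):
    # Base frequency is 1 / (check interval); it is multiplied by the
    # number of simultaneous probers (generally between 5 and 10) and by
    # any per-load-balancer factor, such as the number of forwarding
    # rules or target proxies that point at the backend service.
    return (1.0 / check_interval_s) * num_probers * multiplier

def total_rate_across_services(rates):
    # A backend used by several backend services sees the sum of the
    # probe rates of each backend service's health check.
    return sum(rates)
```

For example, a 5-second interval with 5 probers already yields roughly one probe per second per backend, before any forwarding-rule or target-proxy multiplier.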

Destination for probe packets

The following table shows what network interface and destination IP addresses
are used by health check probers, depending on the type of load balancer.

Load balancer

Destination network interface

Destination IP address

Internal TCP/UDP Load Balancing

The network interface of the instance located in the network specified
for the internal backend service. If not specified, the primary network
interface (nic0) is used.

The IP address of the internal forwarding rule. If multiple forwarding
rules point to the same backend service, Google Cloud sends probes to each
forwarding rule's IP address. This can result in an increase in the number
of probes.

Network Load Balancing

Primary network interface (nic0)

The IP address of the external forwarding rule.

If multiple
forwarding rules point to the same target pool, Google Cloud sends
probes to each forwarding rule's IP address. This can result in an
increase in the number of probes.

External HTTP(S) Load Balancing

Internal HTTP(S) Load Balancing

SSL Proxy Load Balancing

TCP Proxy Load Balancing

Primary network interface (nic0)

For instance group backends, the primary internal IP address
associated with the primary network interface (nic0) of
each instance.

For zonal NEG backends, the IP address of the endpoint, which is either
a primary internal IP address or an alias IP range (of the primary
network interface, nic0, on the instance hosting the
endpoint).

Success criteria for HTTP, HTTPS, and HTTP/2

When a health check uses the HTTP, HTTPS, or HTTP/2 protocol, each probe
requires an HTTP 200 (OK) response code to be delivered before the probe
timeout. In addition, you can do the following:

You can configure Google Cloud probers to send HTTP requests to a
specific request path. If you don't specify a request path, / is used.

If you configure a content-based health check by specifying an expected
response string, each Google Cloud health check probe must find that string
within the first 1,024 bytes of the actual response from your backends.

The following combinations of request path and response string flags are
available for health checks that use HTTP, HTTPS, and HTTP/2 protocols.

The optional response flag allows you to configure a content-based
health check. The expected response string must be less than or equal
to 1,024 ASCII (single byte) characters. When configured,
Google Cloud expects this string within the first 1,024 bytes of the
response in addition to receiving HTTP 200 (OK)
status.
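As a sketch, these HTTP success criteria can be expressed as a small predicate (the function name is hypothetical); note that the expected string must fall entirely within the first 1,024 bytes:

```python
def http_probe_succeeded(status_code, body, expected_response=None):
    # A probe succeeds only on HTTP 200 (OK). If an expected response
    # string is configured, it must appear within the first 1,024 bytes
    # of the response body.
    if status_code != 200:
        return False
    if expected_response is not None:
        return expected_response.encode("ascii") in body[:1024]
    return True
```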

Success criteria for SSL and TCP

Unless you specify an expected response string, probes for health checks that
use the SSL and TCP protocols are successful when both of the following base
conditions are true:

Each Google Cloud prober is able to successfully complete an
SSL or TCP handshake before the configured probe timeout.

For TCP health checks, the TCP session is terminated gracefully by either your
backend or the Google Cloud prober, or your backend sends a TCP
RST (reset) packet while the TCP session to the prober is still
established.

Be aware that if your backend sends a TCP RST (reset) packet to close the
session after the Google Cloud prober has already initiated a graceful TCP
termination, the probe might be considered unsuccessful.

You can create a content-based health check if you provide a request string
and an expected response string, each up to 1,024 ASCII (single byte) characters
in length. When an expected response string is configured, Google Cloud
considers a probe successful only if the base conditions are satisfied and
the response string returned exactly matches the expected response string.

The following combinations of request and response flags are available for
health checks that use the SSL and TCP protocols.

Configuration flags

Success criteria

Neither request nor response specified

Neither flag specified: --request, --response

Google Cloud considers the probe successful when the base
conditions are satisfied.

Only response specified

Flags specified: only --response

Google Cloud waits for the expected response string, and considers
the probe successful when the base conditions are satisfied and the
response string returned exactly matches the expected response
string.

You should only use --response by itself if your
backends would automatically send a response string as part of the TCP or
SSL handshake.

Only request specified

Flags specified: only --request

Google Cloud sends your configured request string and considers the
probe successful when the base conditions are satisfied. The response, if
any, is not checked.

Both request and response specified

Flags specified: both --request and --response

Google Cloud sends your configured request string, waits for the expected
response string, and considers the probe successful when the base
conditions are satisfied and the response string returned exactly matches
the expected response string.
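A client-side sketch of these criteria might look like the following. The function is illustrative: it checks an exact match against the first chunk of data read, which approximates, rather than reproduces, the prober's behavior.

```python
import socket

def tcp_probe(host, port, request=None, expected_response=None, timeout=5.0):
    # Connect, optionally send the request string, and (if configured)
    # require an exact match with the expected response string.
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if request is not None:
                sock.sendall(request.encode("ascii"))
            if expected_response is None:
                # Base conditions only: the handshake completed in time.
                return True
            data = sock.recv(1024)
            return data.decode("ascii", "replace") == expected_response
    except OSError:
        return False
```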

Health state

Google Cloud uses the following configuration flags to determine the
overall health state of each backend being load balanced.

Configuration flags, purposes, and default values:

healthy-threshold (healthy threshold)

The healthy threshold specifies the number of sequential successful probe
results for a backend to be considered healthy. If omitted, Google Cloud
uses a threshold of 2 probes.

unhealthy-threshold (unhealthy threshold)

The unhealthy threshold specifies the number of sequential failed probe
results for a backend to be considered unhealthy. If omitted, Google Cloud
uses a threshold of 2 probes.

Google Cloud considers backends to be healthy after this healthy
threshold has been met. Healthy backends are eligible to receive new connections.

Google Cloud considers backends to be unhealthy when the unhealthy
threshold has been met. Unhealthy backends are not eligible to receive new
connections; however, existing connections are not immediately terminated.
Instead, the connection remains open until a timeout occurs or until traffic is
dropped. The specific behavior differs depending on the type of load balancer
that you're using.

Existing connections might fail to return responses, depending on the cause for
failing the probe. An unhealthy backend can become healthy if it is able to meet
the healthy threshold again.
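The threshold behavior can be modeled with a small state tracker. This is a sketch only; the class and method names are hypothetical, and both thresholds use the documented default of 2.

```python
class BackendHealth:
    # Tracks overall health from sequential probe results, mirroring the
    # healthy-threshold and unhealthy-threshold flags (both default to 2).
    def __init__(self, healthy_threshold=2, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = False
        self._streak = 0             # length of the current run of results
        self._streak_success = None  # whether that run is successes

    def record_probe(self, success):
        # Count consecutive results of the same kind; a run that reaches
        # the matching threshold flips the overall health state.
        if success is not self._streak_success:
            self._streak_success = success
            self._streak = 0
        self._streak += 1
        if success and self._streak >= self.healthy_threshold:
            self.healthy = True
        elif not success and self._streak >= self.unhealthy_threshold:
            self.healthy = False
        return self.healthy
```

Note how a single failed probe does not make a healthy backend unhealthy; only a full run of failures meeting the unhealthy threshold does.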

Additional notes

Content-based health checks

A content-based health check is one whose success criteria depends on evaluation
of an expected response string. Use a content-based health check to instruct
Google Cloud health check probes to more completely validate your
backend's response.

You configure an HTTP, HTTPS, or HTTP/2 content-based health check by
specifying an expected response string, and optionally by defining a request
path. For more details, see Success criteria for HTTP, HTTPS, and
HTTP/2.

You configure an SSL or TCP content-based health check by specifying an
expected response string, and optionally by defining a request string. For more
details, see Success criteria for SSL and TCP.

Certificates and health checks

Google Cloud health check probers do not perform certificate validation,
even for protocols that require that your backends use certificates (SSL, HTTPS,
and HTTP/2)—for example:

You can use self-signed certificates or certificates signed by any certificate
authority (CA).

Certificates that have expired or that are not yet valid are acceptable.

Headers

Health checks that use any protocol (but not legacy health checks) allow
you to set a proxy header by using the --proxy-header flag.

Health checks that use HTTP, HTTPS, or HTTP/2 protocols and legacy health
checks allow you to specify an HTTP Host header by using the --host flag.

Example health check

Suppose you set up a health check with the following settings:

Interval: 30 seconds

Timeout: 5 seconds

Protocol: HTTP

Unhealthy threshold: 2 (default)

Healthy threshold: 2 (default)

With these settings, the health check behaves as follows:

Multiple redundant systems are simultaneously configured with the
health check parameters. Interval and timeout settings are applied to each
system. For more information, see
Multiple probes and
frequency.

Each health check prober does the following:

Initiates an HTTP connection from one of the source IP
addresses to the backend instance every 30 seconds.

Waits up to five seconds for an HTTP 200 (OK) response code (the success
criteria for HTTP, HTTPS, and HTTP/2
protocols).

A backend is considered unhealthy when at least one health check probe
system does not receive an HTTP 200 (OK) response code for two
consecutive probes. For example, the connection might be refused, or
there might be a connection or socket timeout.

A backend is considered healthy when at least one health check probe
system receives two consecutive responses that match the protocol-specific
success criteria.

In this example, each prober initiates a connection every 30 seconds. Thirty
seconds elapses between a prober's connection attempts regardless of the
duration of the timeout (whether or not the connection timed out). In other
words, the timeout must always be less than or equal to the interval, and
the timeout never increases the interval.

In this example, each prober's timing looks like the following, in seconds:
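As a sketch (with hypothetical helper names), the start time and response deadline of each probe from one prober can be computed directly from the interval and timeout:

```python
def probe_start_times(check_interval_s, num_probes):
    # Start times of successive probes from one prober, in seconds. The
    # interval runs start-to-start, so a probe's duration (up to the
    # timeout) never delays the next probe.
    return [i * check_interval_s for i in range(num_probes)]

def probe_deadlines(check_interval_s, timeout_s, num_probes):
    # (start, deadline) pairs: a response must arrive within timeout_s
    # of each probe's start.
    return [(i * check_interval_s, i * check_interval_s + timeout_s)
            for i in range(num_probes)]
```

With the 30-second interval and 5-second timeout from this example, probes start at 0, 30, 60, ... seconds, and each must complete within 5 seconds of its start.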