Tag Archives: SPRINGBOOT

In this article, i will talk about the Running Instance Health, what can represent the Health, how can we detect the health and how can we use this health information to make the System resilient.

Health, basically, defines how well an instance is responding. Health can be:

UP

DOWN

REAL LIFE PROBLEM
Imagine you reach a Bank and found it being closed. Or, Imagine you are standing in a bank counter queue and waiting to be served. By the time your turn arrives, person sitting at a counter goes away. May be that person is not feeling well.

How would you feel in such a situation? Irritated? Frustrated?
What if you would have been told upfront about this situation? Your time would not have wasted. You would not have felt bad.

But what if someone else takes a job of that counter and start serving you.

Now, imagine a pool of servers hosting a site which allows you to upload a video, say http://www.Youtube.com. You are trying to upload a small video of yours on a site and every time you try to upload, you get some error after sometime and video could not be uploaded.

Basically, Software Applications like http://www.youtube.com run on machines – be it physical or virtual in order to get desired results. Executing these applications require machine’s local resources like memory, cpu, network, disk etc or other external dependencies to get things done.
These resources are limited and executing multiple tasks concurrently put a risk of contention and exhaustion.
It may happen that enough resources are not available for execution and thus the task execution will eventually fail.

In order to make the system Resilient, one of the things that can be done is Proactively determine the Health Status and report it – to LoadBalancer or to Service Discoverers etc whenever asked, to prevent or deal with the failures.

Reporting a health Status with proper Http Status Codes like 200 for UP and 500 for DOWN can be quite useful.

WHAT CAN DEFINE INSTANCE\PROCESS HEALTH?
Below is a list of some common metrics that can be useful in detecting the health of an instance:

Pending Requests

Container Level

Message Level

Latency Overhead – Defined as the TP99 latency added by this application/layer

TP99 or TP95 or TP75 as per your Service SLAs

Resources

% Memory Utilization – Leading towards OOM

% CPU Utilization

Host Level

Process Level

Number of Threads

Any Business KPI

External Dependencies Failures optioanlly

Identifying a list of above criterias is important as well as choosing the correct Threshold or Saturation Values as well.
Too low values or high values can result into system unreliability.

WHY IS IT IMPORTANT?

System is usually expected to be highly available and reliable. High Availability can be achieved through Redundancy where in multiple server instances are running in parallel, processing the requests and thus the demand.

What if One or more instances are running out of resources and thus not able to meet the demand.

Detecting such a state at an appropriate time and taking an action can help in achieving High Availability and Reliability of the System.

It helps in making the system resilient against failures.

ACTIONS ON DETECTING UNHEALTHY

REPLENISH thru REBOOT: If you have limited servers pool capacity and cannot increase the capacity, the unhealthy machine has to be restarted\rebooted in order to get it back to healthy state.

REPLACE: If you have unlimited server capacity or using Cloud Computing Framework – AWS, Azure, Google Cloud etc, rather than rebooting the machine, you have an option of starting a new machine and killing and removing the old unhealthy machine from processing the requests.

Once an instance is detected unhealthy, instance shall be replenished or replaced.
Either that unhealthy instance shall be rebooted to get it to Healthy state or be replaced with a new server which is put behind LoadBalancer and old being removed from LoadBalancer.

Spring Boot includes a number of built-in endpoints.
One of the endpoints is the health endpoint which provides basic application health information.
By default, the health endpoint is mapped to /health

On invoking this endpoint, Health information is collected from all HealthIndicator beans defined in your
ApplicationContext and based on Health Status returned by these HealthIndicators, Aggregated Health Status is returned.

Spring Boot includes a number of auto-configured HealthIndicators and allows to write our own.

Since we keep track of certain metrics in our applications, we wanted an ability to evaluate Health based on certain
Metrics’ values. For e.g., if Number of Thread exceed ‘n’, Health shall be reported as DOWN

For this purpose, CompositeMetricBasedHealthEvaluator is implemented.
It relies on either MetricReaders or PublicMetrics to get the Metrics’s current values and evaluate the
Health accordingly.

It reports the Individual Health of all configured Health indicator Criterias and reports Health as DOWN If any of
them is Down.

For Unavailable Metric, Health cannot be determined and thus reported as UNKNOWN for that specific metric.

With the above configuration, 2 Criterias are defined and **HealthCriteriaList** object gets instantiated using
Configuration Annotation.

Here, Thread Criteria specifies that for Health to be **UP**, number of threads < 100.
If NumberOfThreads >= 100, Health will be reported as **DOWN**

Likewise, more criterias can be defined.

Note that
* **metricName** can contain ‘.’ character as well.
* **thresholdOrSaturationLevel** can have any Valid Number, be it Integer or Decimal Number
* **operator** can be any valid value from ComparisonOperator enum.

The below configuration instantiates MetricBasedSpringBootAdapter with MetricReaders only.
Both Parameters, healthCriteriaList and metricReaderList are injected automatically through Spring application
context. This happens due to auto configuration.

The above configuration can be useful wherein MetricReader is not available to read the Metric but Metric is
available publicly through PublicMetrics interface.
With the above configuration, all parameters are injected automatically by Spring.

Things to Note
* Name of Bean minus Suffix HealthIndicator (metricBased) is what is reported as HealthIndicator Name.
* AutoConfiguration of MetricReaders, PublicMetrics or Configuration could be disabled. If this is the case, either
enable AutoConfiguration or manually instantiate MetricReaders, PublicMetrics etc
* PublicMetrics interface can be expensive depending upon the number of metrics being maintained. Use it only if
Custom MetricReader cannot be written or Metrics are small in number.

Blog Stats

Categories

Shortcuts

Advertise YOUR BUSINESS Here

Do You want to advertise your Business here? Contact me @ dem.street@gmail.com or leave a message here. Note that this is an informative site with traffic coming from different websites. No Requests from Irrelevant sites, not suitable as per laws or obscenity etc will be entertained