The only way to fully take advantage of these capabilities is by using Elasticsearch monitoring tools that give you deep visibility into your Elasticsearch environment. The right Elasticsearch monitoring tools can turn data into actionable insights. Monitoring the performance of your Elasticsearch environment with the latest aggregated data helps you stay up-to-date on the internal components of your working cluster. When it comes to Elasticsearch monitoring, there are tons of metrics to consider—here, we’ll take a closer look at four important metrics you should keep on your radar.

During indexing benchmarks, the indexing rate is calculated from a fixed number of records. In write-heavy workloads, tracking how quickly indices are updated with new information makes it easier to monitor and analyze Elasticsearch performance. Sudden spikes and dips in the indexing rate can indicate issues with data sources. Refresh time and merge time can also affect overall cluster performance.
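As a minimal sketch of the idea, an indexing rate can be derived from two snapshots of the `index_total` counter exposed by the indices stats API; the snapshot values below are made-up sample data:

```python
# Sketch: indexing rate from two snapshots of the "index_total" counter.
# The snapshot values are hypothetical sample data, not real cluster output.

def indexing_rate(docs_start, docs_end, interval_seconds):
    """Documents indexed per second between two stats snapshots."""
    return (docs_end - docs_start) / interval_seconds

# index_total as reported by the indices stats API, sampled 60 s apart
snapshot_1 = {"indexing": {"index_total": 1_204_000}}
snapshot_2 = {"indexing": {"index_total": 1_216_000}}

rate = indexing_rate(
    snapshot_1["indexing"]["index_total"],
    snapshot_2["indexing"]["index_total"],
    60,
)
print(f"indexing rate: {rate:.1f} docs/sec")  # 200.0 docs/sec
```

A sudden drop in this rate between consecutive samples is the kind of anomaly worth alerting on.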

Shorter refresh times and faster merge times are generally preferred. A good Elasticsearch performance monitoring tool will track average query latency for every node, along with start time, average segment time per node, file system cache usage, and request rates, and will let you configure alert actions when thresholds are violated.
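To illustrate, average refresh and merge times can be computed from the cumulative totals in the index stats (`total_time_in_millis` divided by `total`); the numbers below are assumed sample values:

```python
# Sketch: average refresh and merge times from index stats totals.
# The stats values are made-up sample data.

stats = {
    "refresh": {"total": 5_000, "total_time_in_millis": 150_000},
    "merges":  {"total": 120,   "total_time_in_millis": 360_000},
}

avg_refresh_ms = stats["refresh"]["total_time_in_millis"] / stats["refresh"]["total"]
avg_merge_ms = stats["merges"]["total_time_in_millis"] / stats["merges"]["total"]

print(f"avg refresh: {avg_refresh_ms:.0f} ms")  # 30 ms
print(f"avg merge:   {avg_merge_ms:.0f} ms")    # 3000 ms
```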

3. Search performance metrics

Apart from index requests, another important request is the search request. Here are some important search performance metrics to consider while performing Elasticsearch monitoring:

Query latency and request rate: Numerous factors can affect query performance, such as poorly constructed queries, improperly configured Elasticsearch clusters, JVM memory pressure, and garbage collection issues. Query latency is a metric with a direct impact on users, so it's essential that you receive alerts when there's an anomaly. Tracking the request rate alongside query latency provides an overview of how heavily the system is used.
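Both metrics can be derived from two samples of the search stats counters (`query_total` and `query_time_in_millis`). A minimal sketch, with assumed sample values:

```python
# Sketch: average query latency and request rate from two samples of the
# search stats counters. The sample values are hypothetical.

def query_metrics(sample_1, sample_2, interval_seconds):
    """Return (avg latency in ms, requests per second) over the interval."""
    dq = sample_2["query_total"] - sample_1["query_total"]
    dt = sample_2["query_time_in_millis"] - sample_1["query_time_in_millis"]
    latency_ms = dt / dq if dq else 0.0
    rate = dq / interval_seconds
    return latency_ms, rate

s1 = {"query_total": 40_000, "query_time_in_millis": 800_000}
s2 = {"query_total": 40_600, "query_time_in_millis": 815_000}

latency, rate = query_metrics(s1, s2, 60)
print(f"latency: {latency:.1f} ms, rate: {rate:.1f} req/s")  # 25.0 ms, 10.0 req/s
```

An alerting rule would then compare `latency` against a threshold and fire when it is exceeded.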

Filter cache: Filters in Elasticsearch are cached by default. While executing a query with a filter, Elasticsearch will find documents matching the filter and build a structure called a bitset using that information. If subsequent query executions have the same filter, then the information stored in the bitset will be reused, making the query execution faster by saving I/O operations and CPU cycles.
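For illustration, a bool query separates scored `must` clauses from `filter` clauses, and it is the filter clauses that Elasticsearch can cache as bitsets. The index and field names below (`status`, `timestamp`, `message`) are made up:

```python
# Sketch: a bool query whose "filter" clauses are candidates for bitset
# caching. Field names and values are hypothetical examples.

cacheable_query = {
    "query": {
        "bool": {
            "must": [{"match": {"message": "timeout"}}],  # scored, not cached
            "filter": [                                   # cacheable as bitsets
                {"term": {"status": 500}},
                # date math rounded to the day, so repeated runs reuse the bitset
                {"range": {"timestamp": {"gte": "now-1d/d"}}},
            ],
        }
    }
}
print(len(cacheable_query["query"]["bool"]["filter"]), "filter clauses")
```

Repeated executions of queries sharing these filter clauses can reuse the cached bitsets, saving I/O and CPU cycles as described above.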

4. Network and thread pool monitoring

Elasticsearch nodes use thread pools to manage thread memory and CPU consumption. Thread pools are configured automatically based on the number of processors. Important thread pools to monitor include search, index, merge, and bulk. Thread pool issues can be caused by a large number of pending requests or a single slow node, and they often surface as thread pool rejections. A drastic change in memory usage or long garbage collection pauses may indicate a critical situation.
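As a sketch, rejections can be spotted by scanning per-node, per-pool counts of active, queued, and rejected requests (the node names and counts below are sample data in the style of the cat thread pool output):

```python
# Sketch: flagging thread pool rejections from per-node pool counts.
# Node names, pool names, and counts are made-up sample data.

sample = """\
node-1 search 2 0 0
node-1 bulk 4 12 37
node-2 search 0 0 0
node-2 bulk 1 3 0"""

rejections = []
for line in sample.splitlines():
    node, pool, active, queue, rejected = line.split()
    if int(rejected) > 0:
        rejections.append((node, pool, int(rejected)))
        print(f"{node}/{pool}: {rejected} rejected requests")
# node-1/bulk: 37 rejected requests
```

A growing rejected count on the bulk or index pool is a strong signal that the node cannot keep up with the write load.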

Too much garbage collection activity can happen for two reasons:

One particular pool is stressed.

The JVM needs more memory than what has been allocated to it.
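The second case can be watched via heap usage per node. A minimal sketch, where the 85% alert threshold and the node stats are assumptions for illustration:

```python
# Sketch: flagging nodes whose JVM heap usage suggests the heap is
# undersized. The threshold and per-node stats are assumed sample values.

HEAP_ALERT_PERCENT = 85

nodes = {
    "node-1": {"jvm": {"mem": {"heap_used_percent": 62}}},
    "node-2": {"jvm": {"mem": {"heap_used_percent": 91}}},
}

overloaded = []
for name, stats in nodes.items():
    used = stats["jvm"]["mem"]["heap_used_percent"]
    if used > HEAP_ALERT_PERCENT:
        overloaded.append(name)
        print(f"{name}: heap at {used}% - allocate more memory or reduce load")
```

Sustained heap usage near the limit forces frequent garbage collection, which is exactly the activity described above.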

To avoid spikes in your thread pools, watch for issues caused by pending requests, a single slow node, or thread pool rejections on your indexing queue.