Configure batch mode search

A search running in batch mode processes events one bucket at a time, in batches, instead of working through events in time order. Transforming searches that qualify for batch mode processing can complete faster than they would otherwise.

Batch mode search also improves the reliability of long-running distributed searches, which can fail when an indexer goes down while the search is running. If a peer goes down during a batch mode search, Splunk software attempts to complete the search by reconnecting to the missing peer or redistributing the search across the rest of the peers.

You can make your batch mode searches even faster by enabling batch mode search parallelization. Under batch mode search parallelization, two or more search pipelines are launched for a qualifying search, and they process the search results concurrently. See "Configure batch mode search parallelization" in this topic.

Requirements for batch mode search

Transforming searches that meet the following conditions can run in batch mode.

The searches need to use generating commands like search, loadjob, datamodel, pivot, or dbinspect.

The search can include transforming commands, such as stats and chart. However, the search cannot include commands like localize and transaction.

If the search is not distributed, it cannot use commands that require time-ordered events, like streamstats, head, and tail.

Configure batch mode search in limits.conf

If you have a Splunk Enterprise deployment (as opposed to Splunk Cloud), you can configure batch mode search throughout the implementation by changing settings in the limits.conf configuration file, under the [search] stanza.

When you have several batch mode search threads running concurrently, they can become a memory usage burden. You can deal with this by disabling batch mode search for your entire implementation, or by limiting the number of events that a batch mode search thread can read at once from an index bucket.

When allow_batch_mode = true, use the batch_search_max_index_values setting to limit the number of events read at once from the index file (bucket). These entries are small, approximately 72 bytes each; however, batch mode is more efficient when it can read more entries at once. The setting defaults to 10000000 (10M).

For example, if your batch mode searches are causing you to run low on system memory, you can lower batch_search_max_index_values to 1000000 (1M) to decrease their memory usage. Setting this parameter to a smaller number can lead to slower search performance, so aim for a balance between efficient batch mode searching and system memory conservation.
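As a rough sketch, the memory-saving change in that example might look like the following in limits.conf. The value shown is illustrative only; the right number depends on how much memory headroom your deployment has.

```ini
[search]
# Keep batch mode search enabled.
allow_batch_mode = true
# Read fewer index entries per batch than the 10000000 default.
# This lowers memory use but can slow batch mode searches.
batch_search_max_index_values = 1000000
```

Comments are placed on their own lines here rather than after the values.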

Set search peer retry period

Other limits.conf settings control how often batch mode retries a search peer after a failure, such as a connection error. The retry interval applies between the failure and the first retry, and between successive retries if further failures occur.

Use the batch_retry_min_interval and batch_retry_max_interval parameters to specify the minimum and maximum intervals (in seconds) to wait before batch mode attempts to retry the search on a failed peer. The minimum interval defaults to 5 seconds. The maximum interval defaults to 300 seconds.

After a retry attempt fails, batch mode increases the time to wait before the next retry by a scaling factor, batch_retry_scaling, which takes a value greater than 1.0. It defaults to 1.5.

Batch mode considers the search complete when all peers have indicated without failure that they have delivered the full answer. If the search finishes but one or more of the peers has failed, batch mode retries the connection with the failed peers for the number of seconds specified by batch_wait_after_end, which defaults to 900 seconds. If batch mode cannot reconnect within this period, it declares the search results to be incomplete.
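For reference, the retry settings in this section could be written out as follows, using the default values stated above. The schedule in the comments is an assumption for illustration: it supposes that each failed retry multiplies the previous wait by batch_retry_scaling and that the wait never exceeds batch_retry_max_interval. The text here does not spell out the exact arithmetic.

```ini
[search]
# Shortest wait, in seconds, before retrying a failed peer.
batch_retry_min_interval = 5
# Longest wait, in seconds, between retries.
batch_retry_max_interval = 300
# Factor applied to the wait after each failed retry (must be greater than 1.0).
# Assumed schedule with these values: 5, 7.5, 11.25, 16.875, ... up to 300 seconds.
batch_retry_scaling = 1.5
# After the search finishes, keep retrying failed peers for this many seconds.
batch_wait_after_end = 900
```

Because these are the defaults, you would only add such lines when changing a value, for example raising batch_retry_min_interval if peers typically take longer to recover.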

Search peer restart for batch mode search

If the search peer is clustered, batch mode waits for the cluster master to spawn a new generation.

If the search peer is not clustered and connection to it is lost, batch mode attempts to reconnect to it, following the retry period parameters described above. When batch mode reestablishes connection to the search peer, it resumes the batch mode search until the search completes.

Configure batch mode search parallelization

You can optionally take advantage of batch mode search parallelization to make your batch mode searches even more efficient. When you enable batch mode search parallelization, two or more search pipelines for batch search run concurrently to read from index buckets and process events. This approach improves the speed and efficiency of your batch mode searches, but at the expense of increased system memory consumption.

You can enable and configure batch mode search parallelization with an additional set of limits.conf parameters. This is an indexer-side setting. It needs to be configured on all of your indexers, not your search head(s).

Use batch_search_max_pipeline to set the number of batch mode search pipelines launched when you run a search that qualifies for batch mode. This parameter has a default value of 1. Set it to 2 or higher to parallelize batch mode searches throughout your Splunk deployment. A higher setting improves search performance at the cost of increasing thread usage and memory consumption.

The batch_search_max_results_aggregator_queue_size parameter controls the size of the results queue. The results queue is where the search pipelines leave processed search results. Its default size is 100MB. Never set it to zero.

The batch_search_max_serialized_results_queue_size parameter controls the size of the serialized results queue, from which the batch search process transmits serialized search results. Its default size is 100MB. Never set it to zero.
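A minimal sketch of enabling parallelization is shown below. It assumes that these settings belong under the same [search] stanza as the other batch mode settings in this topic, and that the file is deployed to every indexer. The value 2 is only an example.

```ini
[search]
# Launch two batch mode search pipelines per qualifying search (default is 1).
# Higher values use more threads and memory on the indexer.
batch_search_max_pipeline = 2
# The two queue size settings are left at their defaults here.
# If you change them, never set either one to zero.
```

Starting at 2 and watching memory consumption on the indexers before raising the value further is one cautious way to weigh the speed gain against the added resource use.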
