Use persistent queues to help prevent data loss

By default, forwarders and indexers have an in-memory input queue of 500KB. If data arrives faster than the forwarder or indexer can process it, and the queue fills up, the outcome depends on the input type. For UDP inputs, data drops off the queue and is lost. For other input types, the application generating the data gets backed up.
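The two outcomes described above can be illustrated with a small, generic Python sketch. It is not Splunk code: the queue holds three items rather than 500KB, and the helper names `offer_or_drop` and `offer_or_block` are hypothetical. It only shows how a full bounded queue either drops new data (UDP-like) or stalls the sender (other inputs).

```python
import queue


def offer_or_drop(q, item):
    """UDP-like behavior: if the queue is full, the item is lost."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


def offer_or_block(q, item, timeout):
    """Stream-like behavior: the sender waits for room in the queue.

    Returns False if no room appeared within the timeout, which models
    a sending application that is backed up.
    """
    try:
        q.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False


# Toy capacity of 3 items; nothing is consuming from the queue.
input_queue = queue.Queue(maxsize=3)

accepted = [offer_or_drop(input_queue, f"datagram-{n}") for n in range(5)]
stalled = not offer_or_block(input_queue, "tcp-chunk", timeout=0.05)

print(accepted)  # the last two datagrams did not fit
print(stalled)   # the blocking sender could not make progress
```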

By implementing persistent queues, you can help prevent this from happening. With persistent queuing, once the in-memory queue is full, the forwarder or indexer writes the input stream to files on disk. It then processes data from the queues (in-memory and disk) until it reaches the point when it can again start processing directly from the data stream.
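As a rough illustration of the spill-to-disk idea, here is a minimal Python sketch. It is a toy model under stated assumptions, not how Splunk software implements persistent queues: the class name `SpillQueue`, the file name `spill.jsonl`, and the item-count limit are all hypothetical, and the sketch ignores size limits on the disk queue.

```python
import json
import os
import tempfile
from collections import deque


class SpillQueue:
    """Toy FIFO queue that writes overflow to a file once memory is full."""

    def __init__(self, max_in_memory, spill_dir):
        self.max_in_memory = max_in_memory
        self.memory = deque()
        self.spill_path = os.path.join(spill_dir, "spill.jsonl")
        self.spilled = 0        # items currently waiting on disk
        self._read_offset = 0   # byte offset of the next unread disk item

    def put(self, item):
        # While anything is on disk, new items also go to disk so that
        # first-in, first-out order is preserved.
        if self.spilled == 0 and len(self.memory) < self.max_in_memory:
            self.memory.append(item)
            return
        with open(self.spill_path, "ab") as f:
            f.write(json.dumps(item).encode("utf-8") + b"\n")
        self.spilled += 1

    def get(self):
        # Memory always holds the oldest items, so drain it first.
        if self.memory:
            return self.memory.popleft()
        if self.spilled == 0:
            raise IndexError("queue is empty")
        with open(self.spill_path, "rb") as f:
            f.seek(self._read_offset)
            line = f.readline()
            self._read_offset = f.tell()
        self.spilled -= 1
        if self.spilled == 0:
            # Disk backlog is cleared; later items use memory again.
            os.remove(self.spill_path)
            self._read_offset = 0
        return json.loads(line)


with tempfile.TemporaryDirectory() as spill_dir:
    q = SpillQueue(max_in_memory=2, spill_dir=spill_dir)
    for n in range(5):
        q.put(f"event-{n}")
    on_disk = q.spilled
    drained = [q.get() for _ in range(5)]
    q.put("event-5")
    back_in_memory = len(q.memory)

print(on_disk, drained, back_in_memory)
```

In this sketch, the first two events stay in memory, the next three go to disk, and the consumer still receives all five in arrival order. Once the disk backlog is cleared, new events are held in memory again.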

Note: While persistent queues help prevent data loss if processing gets backed up, you can still lose data if Splunk software crashes. For example, Splunk software holds some input data in the in-memory queue as well as in the persistent queue files. The in-memory data can get lost if a crash occurs. Similarly, data that is in the parsing or indexing pipeline but that has not yet been written to disk can get lost in the event of a crash.

When can you use persistent queues?

Persistent queuing is available for certain types of inputs, but not all. Generally speaking, it is available for inputs of an ephemeral nature, such as network inputs, but not for inputs that have their own form of persistence, such as file monitoring.
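For a network input, the configuration might look like the stanza rendered by the Python sketch below. Treat it as an assumption-laden example rather than a reference: the TCP port 9994 and both sizes are illustrative values, `render_stanza` is a hypothetical helper, and the `queueSize` and `persistentQueueSize` setting names should be checked against the inputs.conf reference for your version.

```python
import configparser
import io


def render_stanza(stanza, settings):
    """Render one inputs.conf-style stanza as text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the camelCase setting names as written
    parser[stanza] = settings
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Illustrative values only: a TCP network input with a 1MB in-memory
# queue and a 100MB persistent (on-disk) queue.
stanza_text = render_stanza(
    "tcp://9994",
    {"queueSize": "1MB", "persistentQueueSize": "100MB"},
)
print(stanza_text)
```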

Comments

This page states that persistentQueueSize is available for Windows Event Log inputs. However, neither this page nor the inputs.conf page says how to set it up.

Cboillot

June 7, 2018

It would be nice if this page also explained what happens if a persistent queue fills up. Specifically, I am building a modular input, and I wonder whether the Splunk code on the reading end of my XML output stream will stop reading, so that my writes block in this condition. Or will the Splunk code keep reading and just silently drop bytes? If so, which bytes? Newest, oldest, undefined?

"For example, Splunk software holds some input data in the in-memory queue as well as in the persistent queue files." - Does this mean that if the forwarder/system is shut down cleanly, the 500K of cached events will get written to the persistent queue? Is the persistent queue maintained through a reboot?
