Splunk index buckets go through four stages of retirement. When indexed data reaches a frozen state, Splunk removes it from the index. You can configure Splunk to archive the data at this point, instead of deleting it entirely.

| Retirement stage | Description | Searchable? |
|------------------|-------------|-------------|
| Hot | New buckets, open for writing. There are one or more hot buckets for each index. | Yes |
| Warm | Data rolled from hot. There are many warm buckets. | Yes |
| Cold | Data rolled from warm. There are many cold buckets. | Yes |
| Frozen | Data rolled from cold. At the point of rolling to frozen, the data is either deleted (default) or moved to a frozen archive. | No |

You configure the sizes, locations, and ages of indexes and their buckets by editing indexes.conf.

To make changes to indexes.conf, edit a copy in $SPLUNK_HOME/etc/system/local/ or in a custom app directory in $SPLUNK_HOME/etc/apps/. Do not edit the copy in $SPLUNK_HOME/etc/system/default. For information on configuration files and directory locations, see "About configuration files".
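For example, setting up a local copy to edit might look like this (a sketch: "myapp" is a hypothetical app name, and $SPLUNK_HOME must point at your Splunk installation):

```shell
# "myapp" is a hypothetical custom app; any app directory under etc/apps works.
mkdir -p "$SPLUNK_HOME/etc/apps/myapp/local"
# Settings in this file override those in etc/system/default/indexes.conf.
touch "$SPLUNK_HOME/etc/apps/myapp/local/indexes.conf"
```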

Note: All index locations must be writable.

Caution: When you change your data retirement and archiving policy settings, Splunk can delete old data without prompting you.

Remove data when an index grows too large

If an index grows larger than its maximum size, the oldest data is rolled to frozen and removed from the index. By default, frozen data is deleted entirely, although you can specify that Splunk archive it instead, as described in "Archive indexed data".

The default maximum size for an index is 500,000 MB. To change the maximum size, edit the maxTotalDataSizeMB attribute in indexes.conf. For example, to specify the maximum size as 250,000 MB:

[main]
maxTotalDataSizeMB = 250000

Important: Specify the size in megabytes.

Restart Splunk for the new setting to take effect. Depending on how much data there is to process, it can take some time for Splunk to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

Remove data when it ages

Splunk ages out data by buckets. Specifically, when the most recent data in a particular bucket reaches the configured age, the entire bucket is rolled.

To freeze data beyond a specified age, set frozenTimePeriodInSecs in indexes.conf to the number of seconds to elapse before the data gets frozen. The default value is 188697600 seconds, or approximately 6 years. This example configures Splunk to cull old events from its index when they become more than 180 days (15552000 seconds) old:

[main]
frozenTimePeriodInSecs = 15552000

Important: Specify the time in seconds.
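If you prefer to reason in days, you can derive the value in a shell. This sketch recomputes the 180-day figure used in the example above:

```shell
# 180 days converted to seconds: days * hours/day * minutes/hour * seconds/minute
echo $((180 * 24 * 60 * 60))   # → 15552000
```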

Restart Splunk for the new setting to take effect. Depending on how much data there is to process, it can take some time for Splunk to begin to move buckets out of the index to conform to the new policy. You might see high CPU usage during this time.

Archive data

If you want to archive frozen data instead of deleting it entirely, you must tell Splunk to do so, as described in "Archive indexed data". You can create your own archiving script or you can just let Splunk handle the archiving for you. You can later restore ("thaw") the archived data, as described in "Restore archived data".
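For example, to let Splunk handle the archiving itself, you can point the index at an archive directory with the coldToFrozenDir attribute (a sketch; the path is an example, and "Archive indexed data" covers the details and the script-based alternative):

```
[main]
coldToFrozenDir = /mnt/frozen_archive/main
```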

Other ways that buckets age

There are a number of other conditions that can cause buckets to roll from one stage to another, some of which can also trigger deletion or archiving. These are all configurable, as described in "How Splunk stores indexes". For a full understanding of all your options for controlling retirement policy, read that topic.

For example, Splunk rolls buckets when they reach their maximum size. If you are indexing a large volume of events, bucket size is less of a concern for retirement policy, because the buckets fill quickly. You can reduce bucket size by setting a smaller maxDataSize in indexes.conf so that buckets roll faster. Note, however, that searching many small buckets takes longer than searching fewer large ones. You will need to experiment a bit to determine the right bucket size for your needs.
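A sketch of lowering the bucket size for the main index (750 is an arbitrary illustration, in megabytes; maxDataSize also accepts symbolic values such as auto):

```
[main]
maxDataSize = 750
```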

Troubleshoot the archive policy

I ran out of disk space so I changed the archive policy, but it's still not working

If you changed your archive policy to be more restrictive because you've run out of disk space, you may notice that events haven't started being archived according to your new policy. This is most likely because you must first free up some space so the process has room to run. Stop Splunk, clear out ~5GB of disk space, and then start Splunk again. After a while (exactly how long depends on how much data there is to process) you should see INFO entries about BucketMover in splunkd.log showing that buckets are being archived.
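To check for that activity, you might search splunkd.log for BucketMover entries, along these lines (a sketch: the path assumes a default install, and the sample log line is fabricated for illustration only; the real format can differ between Splunk versions):

```shell
# Illustrative only: a fabricated line showing the kind of entry to look for.
printf 'INFO  BucketMover - Moved bucket to frozen archive\n' > splunkd_sample.log

# In practice you would grep the real log instead:
#   grep "BucketMover" "$SPLUNK_HOME/var/log/splunk/splunkd.log"
grep -c "BucketMover" splunkd_sample.log   # counts matching lines
```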

Comments

@Delink:

I have exactly that same question: how do I safely delete data from the cold buckets to free up space?

Dalax
July 6, 2011

There are a number of configuration possibilities that you can use to specify the maximum size of indexes, as well as rolling behavior for buckets. These settings are described here:

http://www.splunk.com/base/Documentation/latest/Admin/HowSplunkstoresindexes

Depending on your specific needs, you could, for example, reduce the size of the maxTotalDataSizeMB attribute, which would cause cold buckets to roll to frozen more quickly. By default, frozen buckets get deleted.

There are alternative settings that might better fit the needs of your particular environment. They're described in the same topic.

Sgoodman
June 22, 2011

"Stop Splunk, clear out ~5GB of disk space, and then start Splunk again."

How might one go about removing 5GB from the partition Splunk's data lives on if that is the only thing there? Can I somehow freeze and delete old buckets manually?
