anomalydetection

Description

A streaming and reporting command that identifies anomalous events by computing a probability for each event and then detecting unusually small probabilities. The probability is defined as the product of the frequencies of each individual field value in the event.

For categorical fields, the frequency of a value X is the number of times X occurs divided by the total number of events.

For numerical fields, we first build a histogram for all the values, then compute the frequency of a value X as the size of the bin that contains X divided by the number of events.
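The frequency and probability definitions above can be illustrated with a minimal Python sketch. This is not the command's implementation: the function names, the list-of-dicts event layout, and the equal-width binning with `num_bins=4` are all assumptions made for illustration, since the description does not say how histogram bins are chosen.

```python
from collections import Counter


def categorical_freqs(values):
    """Map each value to (number of occurrences) / (total number of events)."""
    total = len(values)
    return {v: c / total for v, c in Counter(values).items()}


def numerical_freqs(values, num_bins=4):
    """Return, for each value, (size of its histogram bin) / (number of events).

    Equal-width bins are an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    if width == 0:
        width = 1.0  # all values identical: everything falls in one bin

    def bin_of(x):
        # Clamp the maximum value into the last bin.
        return min(int((x - lo) / width), num_bins - 1)

    counts = Counter(bin_of(x) for x in values)
    total = len(values)
    return [counts[bin_of(x)] / total for x in values]


def event_probabilities(events, categorical_fields, numerical_fields, num_bins=4):
    """Event probability = product of the frequencies of its field values."""
    probs = [1.0] * len(events)
    for field in categorical_fields:
        column = [e[field] for e in events]
        freqs = categorical_freqs(column)
        for i, v in enumerate(column):
            probs[i] *= freqs[v]
    for field in numerical_fields:
        column = [e[field] for e in events]
        for i, f in enumerate(numerical_freqs(column, num_bins)):
            probs[i] *= f
    return probs
```

With three events that share a common status and similar latency and one event that differs in both fields, the differing event receives a much smaller probability than the others.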

The anomalydetection command includes the capabilities of the existing anomalousvalue and outlier commands and offers a histogram-based approach for detecting anomalies.

Description: The actions and defaults depend on the method that you specify. See the detailed descriptions for the actions for each method below.

<pthresh-option>

Syntax: pthresh=<num>

Description: Used with method=histogram or method=zscore. Sets the probability threshold, as a decimal number, that has to be met for an event to be deemed anomalous.

Default: For method=histogram, the command calculates pthresh for each data set during analysis. For method=zscore, the default is 0.01. If you specify pthresh with method=iqr, the command returns an invalid argument error.

<cutoff-option>

Syntax: cutoff=<bool>

Description: Sets the upper bound threshold on the number of anomalies. This option applies only to the histogram method. If cutoff=false, the algorithm uses the formula threshold = 1st-quartile - 1.5 * IQR without modification. If cutoff=true, the algorithm modifies the formula to produce a smaller number of anomalies.
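As a rough illustration of the unmodified formula (the cutoff=false case), the following Python sketch computes the threshold from a list of log event probabilities. Applying the formula to log probabilities is an assumption, suggested by the summary output fields, which report quartiles of log(event_prob). The quartile interpolation method and the function names are also assumptions. The modification applied when cutoff=true is not described in this topic, so it is not sketched.

```python
import statistics


def histogram_threshold(log_probs):
    """Unmodified formula: 1st quartile - 1.5 * IQR.

    Uses inclusive quartile interpolation, which is an assumption of this sketch.
    """
    q1, _, q3 = statistics.quantiles(log_probs, n=4, method="inclusive")
    return q1 - 1.5 * (q3 - q1)


def anomalous(log_probs):
    """Return the log probabilities that fall below the threshold."""
    thresh = histogram_threshold(log_probs)
    return [lp for lp in log_probs if lp < thresh]
```

Only values that are unusually small relative to the bulk of the distribution fall below the threshold.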

Default: true

<field-list>

Syntax: <string> <string> ...

Description: A list of field names.

Histogram actions

<action-option>

Syntax: action=annotate | filter | summary

Description: Specifies whether to return all events with additional fields (annotate), to return only the events with anomalous values (filter), or to return a summary of anomaly statistics (summary).

Default: filter

When action=filter, the command returns anomalous events and filters out other events. Each returned event contains four new fields. When action=annotate, the command returns all the original events, each with the same four new fields that are added when action=filter.

Field: Description

log_event_prob: The natural logarithm of the event probability.

probable_cause: The name of the field that best explains why the event is anomalous. No single field causes an anomaly by itself, but often one field value occurs so rarely that it makes the event probability small.

probable_cause_freq: The frequency of the value in the probable_cause field.

max_freq: The maximum frequency among all field values in the event.
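The four fields can be illustrated with a small Python sketch that takes the per-field frequencies of a single event. The selection rule for probable_cause used here, the field whose value has the lowest frequency, is an assumption. This topic only says that it is the field that best explains the anomaly. The `annotate` function name and the input layout are hypothetical.

```python
import math


def annotate(field_freqs):
    """Build the four output fields for one event.

    field_freqs maps a field name to the frequency of that field's value
    in this event. Picking the lowest-frequency field as probable_cause
    is an assumption of this sketch.
    """
    cause = min(field_freqs, key=field_freqs.get)
    return {
        # log of a product of frequencies equals the sum of their logs
        "log_event_prob": sum(math.log(f) for f in field_freqs.values()),
        "probable_cause": cause,
        "probable_cause_freq": field_freqs[cause],
        "max_freq": max(field_freqs.values()),
    }
```

For an event whose status value has frequency 0.25 while its other field values are more common, this sketch reports status as the probable cause.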

When action=summary, the command returns a single event containing six fields.

Output field: Description

num_anomalies: The number of anomalous events.

thresh: The event probability threshold that separates anomalous events.

max_logprob: The maximum of all log(event_prob).

min_logprob: The minimum of all log(event_prob).

1st_quartile: The first quartile of all log(event_prob).

3rd_quartile: The third quartile of all log(event_prob).

Zscore actions

<action-option>

Syntax: action=annotate | filter | summary

Description: Specifies whether to return the anomaly score (annotate), to return only the events with anomalous values (filter), or to return a summary of anomaly statistics (summary).

Default: filter

When action=filter, the command returns events with anomalous values and drops the other events. The returned events are annotated in the same way as with the annotate action.

When action=annotate, the command adds new fields, Anomaly_Score_Cat(field) and Anomaly_Score_Num(field), to the events that contain anomalous values.

When action=summary, the command returns a table that summarizes the anomaly statistics for each field. The table includes how many events contained the field, the fraction of events that were anomalous, what type of test (categorical or numerical) was performed, and so on.

IQR actions

<action-option>

Syntax: action=remove | transform

Description: Specifies what to do with outliers. The remove action removes the event containing the outlying numerical value. The transform action transforms the event by truncating the outlying value to the threshold for outliers. If mark=true, the transform action prefixes the value with "000".

Abbreviations: The abbreviation for remove is rm. The abbreviation for transform is tf.

Default: action=transform
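The difference between the remove and transform actions can be sketched in Python as follows. This is an illustration under stated assumptions, not the outlier command's implementation: the `multiplier` parameter and its value of 1.5 are illustrative, the quartile interpolation method is assumed, and the sketch checks both a lower and an upper threshold, which the command might not do by default. The mark=true prefixing is not sketched.

```python
import statistics


def iqr_bounds(values, multiplier=1.5):
    """Lower and upper outlier thresholds based on the inter-quartile range.

    The multiplier value and inclusive interpolation are assumptions.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


def remove_outliers(values, multiplier=1.5):
    """Like action=remove: drop values that lie outside the thresholds."""
    lo, hi = iqr_bounds(values, multiplier)
    return [v for v in values if lo <= v <= hi]


def transform_outliers(values, multiplier=1.5):
    """Like action=transform: truncate outlying values to the threshold."""
    lo, hi = iqr_bounds(values, multiplier)
    return [min(max(v, lo), hi) for v in values]
```

With remove, the outlying value disappears from the result. With transform, it stays but is replaced by the threshold value.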

Usage

The zscore method

When you specify method=zscore, the anomalydetection command performs like the anomalousvalue command. You can specify the syntax components of the anomalousvalue command when you use the anomalydetection command with method=zscore. See the anomalousvalue command.

The iqr method

When you specify method=iqr, the anomalydetection command performs like the outlier command. You can specify the syntax components of the outlier command when you specify method=iqr with the anomalydetection command.
For example, you can specify the outlier options <action>, <mark>, <param>, and <uselower>. See the outlier command.

Examples

Example 1: Return only anomalous events

These two searches return the same results. The arguments specified in the second search are the default values.

... | anomalydetection

... | anomalydetection method=histogram action=filter

Example 2: Return a short summary of the anomalous events

Return a short summary of how many anomalous events there are, along with other statistics such as the threshold value used to detect them.

... | anomalydetection action=summary

Example 3: Return events with anomalous values

This example specifies method=zscore to return anomalous values. The search uses the filter action to filter out events that do not have anomalous values. An event must meet the probability threshold set by pthresh, here 0.05, to be considered anomalous.

... | anomalydetection method=zscore action=filter pthresh=0.05

Example 4: Return outliers

This example uses the outlier options from the outlier command. The abbreviation tf is used for the transform action in this example.
