Required arguments

<stats-func>...

Syntax: count(<field>) | <function>(<field>) [AS <string>]

Description: Either perform a basic count of a field or perform a function on a field. You can specify one or more functions. You can also rename the result using the AS keyword, unless you are in prestats mode. You cannot use wildcards to specify field names. See Usage. For a list of the supported functions for the tstats command, refer to the table below.

The following table lists the supported functions by type of function. Use the links in the table to see descriptions and examples for each function. For an overview about using functions with commands, see Statistical and charting functions.
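
As an illustration of the syntax, the following sketch computes two aggregates in one search and renames one of them with AS. The namespace mydata and the field foo are placeholder names, and avg_foo is a hypothetical output name chosen for this sketch.

```spl
| tstats count avg(foo) AS avg_foo FROM mydata
```

The AS rename would have no effect if prestats=true were added, because the consuming command determines the output field names in that mode.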

Optional arguments

append

Syntax: append=<bool>

Description: When in prestats mode (prestats=true), setting append=true appends the prestats results to existing results instead of generating them.

Default: false

allow_old_summaries

Syntax: allow_old_summaries=true | false

Description: Only applies when selecting from an accelerated data model. To return results from summary directories only when those directories are up-to-date, set this parameter to false. If the data model definition has changed, summary directories that are older than the new definition are not used when producing output from tstats. This default ensures that the output from tstats will always reflect your current configuration. When set to true, tstats will use both current summary data and summary data that was generated prior to the definition change. Essentially this is an advanced performance feature for cases where you know that the old summaries are "good enough".

Default: false

chunk_size

Syntax: chunk_size=<unsigned_int>

Description: Advanced option. This argument controls how many events are retrieved at a time within a single TSIDX file when answering queries. Consider supplying a lower value only if you find that a particular query is using too much memory. The usual cause is an excessively high-cardinality split-by, such as grouping by several fields that each have a very large number of distinct values. Setting this value too low can negatively impact the overall run time of your query.

Default: 10000000 (10 million)
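
A minimal sketch of lowering chunk_size for a memory-heavy search. The value 1000000 is purely illustrative and not a recommendation; mydata, foo, bar, and baz are placeholder names.

```spl
| tstats chunk_size=1000000 distinct_count(foo) FROM mydata BY bar, baz
```

If memory use is acceptable at the default, leaving chunk_size unset avoids the run-time penalty described above.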

local

Syntax: local=true | false

Description: If true, forces the processor to be run only on the search head.

Default: false

prestats

Syntax: prestats=true | false

Description: Specifies whether to use the prestats format. The prestats format is a Splunk internal format that is designed to be consumed by commands that generate aggregate calculations. When using the prestats format you can pipe the data into the chart, stats, or timechart commands, which are designed to accept the prestats format. When prestats=true, AS instructions are not relevant. The field names for the aggregates are determined by the command that consumes the prestats format and produces the aggregate output.

Default: false

summariesonly

Syntax: summariesonly=<bool>

Description: Only applies when selecting from an accelerated data model. When false, generates results from both summarized data and data that is not summarized. For data not summarized as TSIDX data, the full search behavior is used against the original index data. When true, tstats generates results only from the TSIDX data that has been automatically generated by the acceleration, and non-summarized data is not provided.

Default: false
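
A sketch of restricting a search to summarized data only. It assumes that the internal_server data model is accelerated in your environment.

```spl
| tstats summariesonly=true count FROM datamodel=internal_server
```

With summariesonly=false (the default), the same search would also scan data that has not yet been summarized, which is typically slower but more complete.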

FROM clause arguments

The FROM clause is optional. You can specify a namespace, a search job ID (sid=<tscollect-job-id>), or a data model (datamodel=<data_model_name>).

namespace

Syntax: <string>

Description: Define a location for the tsidx file with $SPLUNK_DB/tsidxstats. If you have Splunk Enterprise, you can configure this location by editing the local version of the indexes.conf file and setting the tsidxStatsHomePath attribute. See How to edit a configuration file in the Admin manual.

WHERE clause arguments

The WHERE clause is optional. This clause is used as a filter. You can specify either a search or a field and a set of values with the IN operator.

<search-query>

Specify search criteria to filter on.

<field> IN (<value-list>)

For the field, specify a list of values to include in the search results.
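
A sketch of both filter forms in one WHERE clause: a search-style term (index=_internal) followed by an IN list. The sourcetype values are illustrative and might not match the data in your deployment.

```spl
| tstats count WHERE index=_internal sourcetype IN (splunkd, scheduler) BY sourcetype
```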

BY clause arguments

The BY clause is optional. You cannot use wildcards in the BY clause with the tstats command. See Usage. If you use the BY clause, you must specify a field-list. You can also specify a span.

<field-list>

Syntax: <field>, ...

Description: Specify one or more fields to group results.

span

Syntax: span=<timespan>

Description: The span of each time bin. If you use the BY clause to group by _time, use the span argument to set the size of the time buckets. You can specify timespans such as BY _time span=1h or BY _time span=5d. If you do not specify a <timespan>, the default is auto, which means that the number of time buckets adjusts to produce a reasonable number of results. For example, if seconds are initially used for the <timespan> and too many results are returned, the <timespan> is changed to a longer value, such as minutes, to return fewer time buckets.

Default: auto

<timespan>

Syntax: auto | <int><timescale>

<timescale>

Syntax: <sec> | <min> | <hr> | <day> | <month>

Description: Time scale units. For the tstats command, the <timescale> does not support subseconds.

Default: sec

Time scale    Syntax                                   Description
<sec>         s | sec | secs | second | seconds        Time scale in seconds.
<min>         m | min | mins | minute | minutes        Time scale in minutes.
<hr>          h | hr | hrs | hour | hours              Time scale in hours.
<day>         d | day | days                           Time scale in days.
<month>       mon | month | months                     Time scale in months.

Usage

The tstats command is a generating command. Generating commands use a leading pipe character.
The tstats command must be the first command in a search pipeline, except when append=true is specified.

Wildcard characters

The tstats command does not support wildcard characters in field names in aggregate functions or BY clauses.

For example, you cannot specify | tstats avg(foo*) or | tstats count WHERE host=x BY source*.

If the aggregate function or BY clause includes a wildcard character, any results returned come only from the most recent few minutes of data that has not yet been summarized. Include the summariesonly=t argument with your tstats command to return only summarized data.

Nested eval expressions not supported

You cannot use eval expressions inside aggregate functions with the tstats command.

For example, | tstats count(eval(...)) is not supported.

While nested eval expressions are supported with the stats command, they are not supported with the tstats command.

Functions and memory usage

Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the distinct_count function requires far more memory than the count function. The values and list functions also can consume a lot of memory.

If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.
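
A sketch of the substitution, using the placeholder namespace mydata and field foo. The output name approx_distinct_foo is a hypothetical label chosen for this sketch.

```spl
| tstats estdc(foo) AS approx_distinct_foo FROM mydata
```

Because estdc returns an estimate, this substitution fits cases where an approximate count is acceptable.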

Memory and maximum results

In the limits.conf file, the maxresultrows setting in the [searchresults] stanza specifies the maximum number of results to return. The default value is 50,000. Increasing this limit can result in more memory usage.

The max_mem_usage_mb setting in the [default] stanza is used to limit how much memory the tstats command uses to keep track of information. If the tstats command reaches this limit, the command stops adding the requested fields to the search results. You can increase the limit, contingent on the available system memory.

If you are using Splunk Cloud and want to change either of these limits, file a Support ticket.

Complex aggregate functions

The tstats command does not support complex aggregate functions such as ...count(eval('Authentication.action'=="failure")).

Consider the following query. This query will not return accurate results because complex aggregate functions are not supported by the tstats command.
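
The sketch below illustrates the pattern under stated assumptions: an accelerated data model named Authentication with the fields Authentication.action and Authentication.src. The first search is a hypothetical query of the unsupported kind. The second search is one possible rewrite that keeps only a plain count in tstats, groups by the action field, and moves the conditional logic into eval and stats.

```spl
| tstats count(eval('Authentication.action'=="failure")) AS failure FROM datamodel=Authentication BY Authentication.src

| tstats count FROM datamodel=Authentication BY Authentication.action, Authentication.src
| eval failure=if('Authentication.action'=="failure", count, 0), success=if('Authentication.action'=="success", count, 0)
| stats sum(failure) AS failure, sum(success) AS success BY Authentication.src
```

The rewrite works because tstats is asked only for a simple count per action and source, and the per-action totals are then assembled by commands that do support eval expressions.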

Selecting data

Use the tstats command to perform statistical queries on indexed fields in tsidx files. You can select the data for the indexed fields in several ways.

Normal index data

Use a FROM clause to specify a namespace, search job ID, or data model. If you do not specify a FROM clause, the Splunk software selects from index data in the same way as the search command. You are restricted to selecting data from your allowed indexes by user role. You control exactly which indexes you select data from by using the WHERE clause. If no indexes are mentioned in the WHERE clause, the Splunk software uses the default indexes. By default, role-based search filters are applied, but can be turned off in the limits.conf file.
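
A sketch of selecting from normal index data with no FROM clause. The index is named explicitly in the WHERE clause so that the default indexes are not used; _internal is chosen here only because it exists on most deployments.

```spl
| tstats count WHERE index=_internal BY sourcetype
```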

Data manually collected with the tscollect command

You can select data from your namespace by specifying FROM <namespace>. If you did not specify a namespace with the tscollect command, the data is collected into the dispatch directory of that job. If the data is in the dispatch directory, you select the data by specifying FROM sid=<tscollect-job-id>.
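
A sketch of the namespace workflow as two separate searches. The first search collects data into a namespace with tscollect, and the second search reads it back. The index, sourcetype, and namespace names are hypothetical.

```spl
index=main sourcetype=access_combined | tscollect namespace=mydata

| tstats count FROM mydata
```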

An accelerated data model

You can select data from a high-performance analytics store, which is a collection of .tsidx data summaries, for an accelerated data model. You can select data from this accelerated data model by using FROM datamodel=<data_model_name>.

Search filters cannot be applied to accelerated data models. This includes both role-based and user-based search filters.

An accelerated data model dataset

When you select data within an accelerated data model, you can further constrain your search by indicating a dataset within that data model that you want to select data from. You do this by using a WHERE clause to indicate the nodename of the data model dataset. The nodename value indicates where the dataset is in a data model hierarchy.

When you use nodename in a search, you always use the following construction: FROM datamodel=<data_model_name> where nodename=<root_dataset_name>.<parent_dataset_name>.<...>.<target_dataset_name>.

For example, suppose you want to search on a dataset named scheduled_reports in your internal_server data model. In that data model, the scheduled_reports dataset is a child of the scheduler dataset, which in turn is a child of the server root event dataset. This means that you should represent the scheduled_reports dataset in your search as nodename=server.scheduler.scheduled_reports.

If you run that search and decide you want to search on the contents of the scheduler data model dataset instead, you would use nodename=server.scheduler in your new search.
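
Put together, the scheduled_reports case described above might be written as follows. This sketch assumes that the internal_server data model is accelerated.

```spl
| tstats count FROM datamodel=internal_server WHERE nodename=server.scheduler.scheduled_reports
```

Changing the nodename value to server.scheduler would widen the search to the parent dataset.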

Search filters cannot be applied to accelerated data model datasets. This includes both role-based and user-based search filters.

You might see a count mismatch in the events retrieved when searching tsidx files, because it is not possible to distinguish between indexed field tokens and raw tokens in tsidx files. By contrast, running the tstats command against an accelerated data model or against data from a tscollect command is more explicit, because only the fields and values are stored and not the raw tokens.

Filtering with WHERE

You can provide any number of aggregates (aggregate-opt) to perform and also have the option of providing a filtering query using the WHERE keyword. This query looks like a normal query you would use in the search processor. This supports all the same time arguments as search, such as earliest=-1y.
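
A sketch that combines several aggregates with a WHERE filter that includes time arguments. The names mydata, foo, bar, and value2 are placeholders, and the seven-day window is arbitrary.

```spl
| tstats count avg(foo) FROM mydata WHERE bar=value2 earliest=-7d latest=now
```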

Grouping by _time

You can provide any number of GROUPBY fields. If you are grouping by _time, supply a timespan with span for grouping the time buckets, for example ...BY _time span=1h or ...BY _time span=3d.
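
A sketch of grouping by both a regular field and _time. The span argument is placed directly after _time. The index name is illustrative.

```spl
| tstats count WHERE index=_internal BY sourcetype, _time span=1h
```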

Examples

Example 1: Gets the count of all events in the mydata namespace.

| tstats count FROM mydata

Example 2: Returns the average of the field foo in mydata, specifically where bar is value2 and the value of baz is greater than 5.

| tstats avg(foo) FROM mydata WHERE bar=value2 baz>5

Example 3: Gives the count by source for events with host=x.

| tstats count WHERE host=x BY source

Example 4: Gives a timechart of all the data in your default indexes with a day granularity.

| tstats prestats=t count BY _time span=1d | timechart span=1d count

Example 5: Use prestats mode in conjunction with append to compute the median values of foo and bar, which are in different namespaces.
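
A sketch of how such a search might look, assuming that foo is stored in the mydata namespace and bar in a second namespace, here hypothetically named otherdata. The second tstats command uses append=t, which is why it can appear after the first command in the pipeline.

```spl
| tstats prestats=t median(foo) FROM mydata | tstats prestats=t append=t median(bar) FROM otherdata | stats median(foo) median(bar)
```

The final stats command consumes the prestats output of both tstats commands and produces the two median values.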
