fillmissing

When you run a standard group-by query, Sumo Logic only returns non-empty groups in the results. For example, if your query is grouping by timeslice, then only the timeslices that have data are returned.

This can be a problem because:

The lack of data is sometimes also an interesting event, but there is no easy way to capture this information. For example, the outlier operator cannot catch anomalies arising from missing data because it can only mark an existing timeslice as anomalous.

Missing data can lead to misleading visualizations. For example, if you plot a line chart across timeslices with missing data, the chart will interpolate across the missing timeslices and represent them deceptively as nonempty.

The fillmissing operator addresses this shortcoming by allowing you to specify groups that should be represented in the output, even if those groups have no data.

How it works

The fillmissing operator allows you to define generators over the fields in the output. Each generator applies to one field and enumerates all the values that you would like to appear in the output for that field, even if some of those values are not present in data.

You can define multiple generators, which enumerate tuples for every combination of the values enumerated by each of the generators (i.e., the Cartesian product). For example, if you used the following query:

The generators for the three fields (A, B, and C) enumerate the tuples:

A

B

C

a1

b1

c1

a1

b1

c2

a1

b1

c3

a2

b1

c1

a2

b1

c2

a2

b1

c3

Fields with generators defined are called key fields. The remaining fields are called non-key fields.

Given the list of generators, fillmissing ensures that every tuple enumerated by the generators are present in the output. In particular, when one of the enumerated tuples is missing, fillmissing will append a record to the output with the missing values for the key fields and some constant default value for the non-key fields.

Generators

The fillmissing operator currently supports the following types of generators:

Timeslice. Enumerates all the timeslices with a given granularity in a query time range. For example, timeslice(15m) enumerates all the 15-minute timeslices in the query time range.
Buckets need to be based on a time period.

Values. Enumerates the fixed set of values given in arguments. For example, values("a", "b", "c") enumerates the values "a", "b", and "c". Currently, only string literals are supported for the arguments.

Default values for non-key fields

When fillmissing appends a record to the output, the key fields of the record contain the missing values, while the remaining fields contain some constant value. You can configure the constant value for those fields. If you don't, a default value is assigned that depends on the type of the field:

Field Type

Default Value

integer

0

double

0.0

boolean

false

String (or other)

null

IMPORTANT: The fillmissing operator allows generators to enumerate up to 10,000 combinations of values; the same limit as group-by operators. If the limit is exceeded, extra values are omitted. The operator will issue a warning when this happens.

Syntax

This section describes the syntax for the fillmissing operator.

fillmissing <keyFieldGenerator> [ with <nonKeyFieldSpecs> ]

The keyFieldGenerator generates key fields that the operator then references to ensure all specified combination of values are present. Any missing values are filled based on the specified nonKeyFieldSpecs, one for each key field. Two generators are supported:

Generator

Syntax

Timeslice

timeslice(<integer><time_period>) [in <field>]

Values

values("<value1>", "<value2>", ...) in <field>

The nonKeyFieldSpecs are optional. They allow you to configure the default constant values for one or more non-key fields. The syntax for each specification looks like:

Non key fields

Syntax

Constant value <constantValue>

<double> | <int> | <stringLiteral> | null

Constant non key

<constantValue> for <field>

Rules

In Live Dashboards, you must use the fillmissing operator after an aggregate operator.

Buckets from the timeslice generator need to be based on a time period. Supported <time_period> values are weeks (w), days (d), hours (h), minutes (m), and seconds (s).

Examples

Timeslice generator

This example query counts the number of login events in 15-minute timeslices. Notice that in the query with fillmissing, timeslices with a count of zero are shown in the output.

Notice also that for the timeslice generator, the key field name is optional. (It defaults to _timeslice.)

Multiple generators and transpose

This example shows how multiple generators can be used to enumerate every combination of the required values in two key fields (type and _timeslice). Notice that while transpose can show some of the missing values, it misses the rows where none of the types have any data.

Note that in this simple example, you can achieve a similar effect without the need to specify all the expected values for the type field, by applying the fillmissing operator after the transpose, like this:

Recommended articles

Sumo Logic is the industry’s leading secure, cloud-native, machine data analytics service, delivering real-time, continuous intelligence across the entire application lifecycle and stack. More than 1,000 customers around the globe rely on Sumo Logic for the analytics and insights to build, run and secure their modern applications and cloud infrastructures.