A typical case is aggregating or tracking user behaviour. We can track a user by their ID through the events, but once the user stops interacting, the events stop coming in, and there is no specific event indicating the end of the user's interaction.

In this case, we can enable the option push_map_as_event_on_timeout so that the aggregation map is pushed as a new event when a timeout occurs.
In addition, we can set timeout_code to execute code on the populated timeout event.
We can also set timeout_task_id_field so the task_id is copied into the timeout event; in this case, it would be the user's ID.
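Putting these options together, such a configuration could look like the following sketch (the user_id and clicks field names, and the timeout values, are illustrative assumptions):

```
filter {
  aggregate {
    task_id => "%{user_id}"
    code => "map['clicks'] ||= 0; map['clicks'] += 1;"
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "user_id"
    timeout => 600
    timeout_tags => ['_aggregatetimeout']
    timeout_code => "event.set('several_clicks', event.get('clicks') > 1)"
  }
}
```

After 600 seconds without activity for a given user_id, a new event is pushed containing the aggregated clicks counter, the user_id, and the _aggregatetimeout tag.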

Fourth use case: like example #3, you have no specific end event, but in addition, tasks come one after the other.
That is to say: tasks are not interlaced. All task1 events come, then all task2 events come, and so on.
In that case, you don't want to wait for the task timeout to flush the aggregation map.
* A typical case is aggregating results from the jdbc input plugin.
* Suppose you have this SQL query: SELECT country_name, town_name FROM town
* Using the jdbc input plugin, you get these 3 events:
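A configuration along these lines aggregates towns per country using push_previous_map_as_event (a sketch; the exact Ruby code in the code option is illustrative):

```
filter {
  aggregate {
    task_id => "%{country_name}"
    code => "
      map['country_name'] = event.get('country_name')
      map['towns'] ||= []
      map['towns'] << {'town_name' => event.get('town_name')}
      event.cancel()
    "
    push_previous_map_as_event => true
    timeout => 5
    timeout_tags => ['aggregated']
  }
}
```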

The key point is that each time the aggregate filter detects a new country_name, it pushes the previous aggregate map as a new Logstash event (with the aggregated tag), and then creates a new empty map for the next country.

When the 5-second timeout expires, the last aggregate map is pushed as a new event.

Finally, the initial events (which are not aggregated) are dropped because they are useless.

* the filter needs a "task_id" to correlate events (log lines) of the same task
* at the task beginning, the filter creates a map attached to the task_id
* for each event, you can execute code using the event and the map (for instance, copy an event field to the map)
* in the final event, you can execute last code (for instance, add map data to the final event)
* after the final event, the map attached to the task is deleted
* in one filter configuration, it is recommended to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
* if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
* all timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are: timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
* if code execution raises an exception, the error is logged and the event is tagged _aggregateexception
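The lifecycle above can be sketched with a minimal start/update/end configuration (the taskid, logger, and duration field names are illustrative assumptions; they would typically be extracted upstream, e.g. by a grok filter):

```
filter {
  if [logger] == "TASK_START" {
    aggregate {
      task_id => "%{taskid}"
      code => "map['sql_duration'] = 0"
      map_action => "create"
    }
  }
  if [logger] == "SQL" {
    aggregate {
      task_id => "%{taskid}"
      code => "map['sql_duration'] += event.get('duration')"
      map_action => "update"
    }
  }
  if [logger] == "TASK_END" {
    aggregate {
      task_id => "%{taskid}"
      code => "event.set('sql_duration', map['sql_duration'])"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }
}
```

The map is created on the start event, updated on intermediate events, copied into the final event, and then deleted because end_of_task is true.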

If the event has field "somefield" == "hello", this filter, on success,
would add field foo_hello if it is present, with the
value above and the %{host} piece replaced with that value from the
event. The second example would also add a hardcoded field.
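The two examples this paragraph refers to are not shown here; a sketch of what they typically look like, in the usual form of the common add_field option (the field names and values are illustrative):

```
filter {
  aggregate {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}

filter {
  aggregate {
    add_field => {
      "foo_%{somefield}" => "Hello world, from %{host}"
      "new_field" => "new_static_value"
    }
  }
}
```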

When this option is enabled, each time a task timeout is detected, the filter pushes the task aggregation map as a new Logstash event.
This makes it possible to detect and process task timeouts in Logstash, and also to manage tasks that have no explicit end event.

The code to execute to complete the timeout-generated event, when push_map_as_event_on_timeout or push_previous_map_as_event is set to true.
The code block has access to the newly generated timeout event, which is pre-populated with the aggregation map.

If timeout_task_id_field is set, the event is also populated with the task_id value.
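For example, a timeout_code value such as the following marks the generated event so it can be routed or filtered downstream (the state field name is illustrative):

```
filter {
  aggregate {
    task_id => "%{user_id}"
    code => "map['clicks'] ||= 0; map['clicks'] += 1;"
    push_map_as_event_on_timeout => true
    timeout => 120
    timeout_code => "event.set('state', 'timeout')"
  }
}
```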