Logstash Input Plugins

In this article, Saurabh Chhajed, author of the book Learning ELK Stack, covers Logstash input plugins. Logstash has a variety of plugins that help integrate it with different input and output sources. Let's explore the plugins available.

Listing all plugins in Logstash

You can execute the following command to list all available plugins in your installed Logstash version:

bin/plugin list

Also, you can list all plugins containing a name fragment by executing this command:

bin/plugin list <namefragment>

To list all plugins belonging to a group name (input, output, or filter), we can execute this command:

bin/plugin list --group <group name>
bin/plugin list --group output

Before exploring various plugin configurations, let's take a look at the data types and conditional expressions used in various Logstash configurations.

Data types for plugin properties

A Logstash plugin requires certain settings or properties to be set. Each property takes a value belonging to one of the following important data types.

Array

An array is a collection of values for a property.

An example can be seen as follows:

path => ["value1","value2"]

The => sign is the assignment operator used to set all property values in a Logstash configuration.

Boolean

A boolean value is either true or false (without quotes).

An example can be seen as follows:

periodic_flush => false

Codec

Codec is actually not a data type but a way to encode or decode data at input or output.

An example can be seen as follows:

codec => "json"

This example specifies that, at the output, the codec will encode all events in JSON format.

Hash

A hash is a collection of key-value pairs. Each pair is specified as "key" => "value", and multiple pairs in a collection are separated by spaces.

An example can be seen as follows:

match => {
  "key1" => "value1"
  "key2" => "value2"
}

String

String represents a sequence of characters enclosed in quotes.

An example can be seen as follows:

value => "Welcome to ELK"

Comments

Comments begin with the # character.

An example can be seen as follows:

#this represents a comment

Field references

Fields can be referred to using [field_name] or nested fields using [level1][level2].

Logstash conditionals

Logstash conditionals are used to filter events or log lines under certain conditions. Conditionals in Logstash work as in other programming languages, with if, else if, and else statements, and multiple conditional blocks can be nested.
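As a sketch, the following filter drops debug events and tags the rest; drop and mutate are standard Logstash filter plugins, while the loglevel field and the tag values are only illustrative:

```
filter {
  # "loglevel" is a hypothetical field used for illustration
  if [loglevel] == "DEBUG" {
    drop { }
  } else if [loglevel] == "WARN" {
    mutate { add_tag => ["warning"] }
  } else {
    mutate { add_tag => ["normal"] }
  }
}
```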

Types of Logstash plugins

Now let's take a look at some of the most important input, output, filter and codec plugins, which will be useful for building most of the log analysis pipeline use cases.

Input plugins

An input plugin is used to configure a set of events to be fed to Logstash. Some of the most important input plugins are:

file

The file plugin is used to stream events and log lines from files to Logstash. It automatically detects file rotation and resumes reading from the position it last read.

The Logstash file plugin maintains sincedb files to track the current positions in the files being monitored. By default, it writes sincedb files at $HOME/.sincedb*. The location and write frequency can be altered using the sincedb_path and sincedb_write_interval properties of the plugin.

The most basic file configuration looks like this:

input {
  file {
    path => "/path/to/logfiles"
  }
}

The only required configuration property is the path to the files. Let's look at how we can make use of some of the configuration properties of the file plugin to read different types of files.

Configuration options

The following configuration options are available for the file input plugin:

add_field

It is used to add a field to incoming events. Its value type is hash, and the default value is {}.

Let's take the following instance as an example:

add_field => { "input_time" => "%{@timestamp}" }

codec

It is used to specify a codec, which can decode a specific type of input.

For example: codec => "json" is used to decode the json type of input.

The default value of codec is "plain".

delimiter

It is used to specify a delimiter, which identifies separate lines. By default, it is "\n".

exclude

It is used to exclude certain types of files from the input path. Its value type is array.

Let's take the following instance as an example:

path =>["/app/packtpub/logs/*"]
exclude => "*.gz"

This will exclude all gzip files from input.

path

This is the only required configuration for the file plugin. It specifies an array of path locations from where to read logs and events.

sincedb_path

It specifies the location where the sincedb files, which keep track of the current position of monitored files, are written. The default is $HOME/.sincedb*.

sincedb_write_interval

It specifies how often (in seconds) the sincedb files, which keep track of the current position of monitored files, are written. The default is 15 seconds.

start_position

It takes one of two values: "beginning" or "end", and specifies where to start reading incoming files from. The default value is "end", as in most situations the plugin is used for live streaming data. However, if you are working with old data, it can be set to "beginning".

This option has an impact only when a file is being read for the first time, called "first contact", because after that the position is maintained in the sincedb file. On subsequent runs, this option has no effect unless you decide to remove the sincedb files.
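For instance, to read an archived log from the beginning, a sketch like the following can be used (the path is hypothetical):

```
input {
  file {
    # hypothetical location of archived logs
    path => ["/data/archive/*.log"]
    # read from the start on first contact
    start_position => "beginning"
  }
}
```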

tags

It specifies the array of tags that can be added to incoming events. Adding tags to your incoming events helps with processing later, when using conditionals. It is often helpful to tag certain data as "processed" and use those tags to decide a future course of action.

For example, if we specify "processed" in tags:

tags =>["processed"]

In a filter, we can then check for the tag in conditionals:

filter {
  if "processed" in [tags] {
  }
}

type

The type option is really helpful for processing different types of incoming streams using Logstash. You can configure multiple input paths for different types of events; just give each a type name, and you can then filter and process them separately.

For example, we can configure a separate type, such as "syslog" or "apache", for each set of incoming files. Later, when filtering the stream, we can specify conditionals based on this type.
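A sketch of such a configuration might look like this, assuming hypothetical log paths; the type values then become available in filter conditionals:

```
input {
  file {
    path => ["/var/log/messages"]      # hypothetical path
    type => "syslog"
  }
  file {
    path => ["/var/log/apache/*.log"]  # hypothetical path
    type => "apache"
  }
}

filter {
  if [type] == "apache" {
    # apache-specific filtering goes here
  }
}
```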

stdin

The stdin plugin is used to stream events and log lines from standard input.

A basic configuration for stdin looks like this:

stdin {
}

When stdin is configured like this, whatever we type in the console goes as input to the Logstash event pipeline. This is mostly used as a first level of testing of a configuration before plugging in the actual file or event input.
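As a minimal test sketch, the following configuration echoes every typed line back to the console; the rubydebug codec pretty-prints each event:

```
input {
  stdin {
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
```

Saved as, say, stdin.conf, it can be run with bin/logstash -f stdin.conf; each line typed into the console then appears as a structured Logstash event.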

Configuration options

The following configuration options are available for the stdin input plugin:

add_field

The add_field configuration for stdin is the same as add_field in the file input plugin and is used for similar purposes.

codec

It is used to decode incoming data before passing it on to the data pipeline. The default value is "line".

tags

The tags configuration for stdin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for stdin is the same as type in the file input plugin and is used for similar purposes.

twitter

You may need to analyze a Twitter stream based on a topic of interest for various purposes, such as sentiment analysis, trending topics analysis, and so on. The twitter plugin is helpful to read events from the Twitter streaming API. This requires a consumer key, consumer secret, keyword, oauth token, and oauth token secret to work.
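A sketch of a twitter input might look like the following; the placeholder credentials must be replaced with values from your Twitter developer account, and the keywords list shown here is only illustrative:

```
input {
  twitter {
    consumer_key => "<consumer_key>"              # placeholder
    consumer_secret => "<consumer_secret>"        # placeholder
    oauth_token => "<oauth_token>"                # placeholder
    oauth_token_secret => "<oauth_token_secret>"  # placeholder
    keywords => ["elasticsearch", "logstash"]     # topics of interest
  }
}
```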

oauth_token_secret

The oauth_token_secret option is obtained from the Twitter dev API page.

tags

The tags configuration for the twitter input plugin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for the twitter input plugin is the same as type in the file input plugin and is used for similar purposes.

lumberjack

The lumberjack plugin is useful for receiving events via the lumberjack protocol, which is used by Logstash forwarder.

The basic required configuration option for the lumberjack plugin looks like this:

lumberjack {
  port => 5043                                # example port; any free port works
  ssl_certificate => "/etc/ssl/logstash.pub"
  ssl_key => "/etc/ssl/logstash.key"
}

Lumberjack, or Logstash forwarder, is a lightweight log shipper used to ship log events from source systems. Logstash is quite a memory-consuming process, so installing it on every node you want to ship data from is not recommended. Logstash forwarder is a lightweight version of Logstash that provides low latency, secure and reliable transfer, and low resource usage.

Configuration options

The following configuration options are available for the lumberjack input plugin:

add_field

The add_field configuration for the lumberjack plugin is the same as add_field in the file input plugin and is used for similar purposes.

codec

The codec configuration for the lumberjack plugin is the same as codec in the file input plugin and is used for similar purposes.

host

It specifies the host to listen on. The default value is "0.0.0.0".

port

This is a required setting of the number type; it specifies the port to listen on. There is no default value.

ssl_certificate

It specifies the path to the SSL certificate to be used for the connection. It is a required setting.

An example is as follows:

ssl_certificate => "/etc/ssl/logstash.pub"

ssl_key

It specifies the path to the SSL key that has to be used for the connection. It is also a required setting.

An example is as follows:

ssl_key => "/etc/ssl/logstash.key"

ssl_key_passphrase

It specifies the SSL key passphrase that has to be used for the connection.

tags

The tags configuration for the lumberjack input plugin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for the lumberjack input plugins is the same as type in the file input plugin and is used for similar purposes.

redis

The redis plugin is used to read events and logs from a Redis instance.

Redis is often used in ELK Stack as a broker for incoming log data from the Logstash forwarder, which helps to queue data until the time the indexer is ready to ingest logs. This helps to keep the system in check under heavy load.

The basic configuration of the redis input plugin looks like this:

redis {
}

Configuration options

The following configuration options are available for the redis input plugin:

add_field

The add_field configuration for redis is the same as add_field in the file input plugin and is used for similar purposes.

codec

The codec configuration for redis is the same as codec in the file input plugin and is used for similar purposes.

data_type

The data_type option can take a value of either "list", "channel", or "pattern_channel".
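For example, with data_type set to "list", Logstash pops events from the given Redis list; a sketch (assuming a local Redis instance and the commonly used "logstash" key) looks like this:

```
input {
  redis {
    host => "localhost"   # assumed local Redis instance
    data_type => "list"
    key => "logstash"     # name of the Redis list to pop events from
  }
}
```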
