Tag: Kibana

I’ve just published a new blog post on the Skedler Blog. In this two-part blog post, we are going to present a system architecture to convert audio and voice into written text with AWS Transcribe, extract useful information for quick understanding of content with AWS Comprehend, index this information in Elasticsearch 6.2 for fast search and visualize the data with Kibana 6.2. In Part I, you can learn about the key components, architecture, and common use cases. In Part II, you can learn how to implement this architecture.

Adding APM (Application Performance Monitoring) to the Elastic Stack is a natural next step in providing our users with end-to-end monitoring, from logging, to server-level metrics, to application-level metrics, all the way to the end-user experience in the browser or client.

Elastic APM consists of three components:

Agents: libraries that run inside your application process and automatically measure the duration of requests to your service, along with things like database queries, cache calls, external HTTP requests and errors.

APM Server: a server (written in Go) that processes data from the agents and stores it in Elasticsearch.

Kibana UI: dashboards that give you an instant overview of application response times, requests per minute, error occurrences and more.

The APM server and the agents (right now available only for Python and Node.js) are open source.

In Kibana version 5.5 a new type of chart has been added: Region Map. Region maps are thematic maps in which boundary vector shapes are colored using a gradient: higher intensity colors indicate larger values, and lower intensity colors indicate smaller values. These are also known as choropleth maps.

In this post we are going to see how to use the Region Map to visualize the geolocation detail of a stream of Tweets (consumed using the Twitter streaming API). Basically we will show the location (by country) of a stream of Tweets on the map (higher intensity colors indicate larger volume of Tweets).

I am using Elasticsearch and Kibana version 5.5 on Ubuntu 14.04 and Python 3.4.

We are going to use the Twitter streaming API to consume the public data stream flowing through Twitter, filtered by a set of hashtags/keywords. Given the latitude and longitude (GeoJSON format) of each tweet, when available, we are going to use the Google Maps Geocoding API to get the country name (or code).
Once we have identified the country, we are going to index the Tweet in Elasticsearch and then visualize its location using the Kibana Region Map. For each Tweet we are interested in the country (the geographic location of the Tweet as reported by the user or client application), the text (for further queries) and the creation date (to filter our results).
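
The country lookup step can be sketched as follows. This is a minimal example, assuming the Geocoding API response has already been fetched and parsed into a dict; the function name country_from_geocode and the sample payload are made up for illustration, and only the address_components / types layout comes from the Geocoding response format.

```python
def country_from_geocode(response):
    """Return (long_name, short_name) of the first country component, or None.

    `response` is assumed to be the parsed JSON of a reverse-geocoding call.
    """
    for result in response.get("results", []):
        for component in result.get("address_components", []):
            # The country entry is the component tagged with the "country" type
            if "country" in component.get("types", []):
                return component["long_name"], component["short_name"]
    return None


# Hand-written sample payload, trimmed to the fields used above
sample = {
    "status": "OK",
    "results": [
        {
            "address_components": [
                {"long_name": "Rome", "short_name": "Rome",
                 "types": ["locality", "political"]},
                {"long_name": "Italy", "short_name": "IT",
                 "types": ["country", "political"]},
            ]
        }
    ],
}

print(country_from_geocode(sample))
```

Whether you keep the long name or the two-letter code depends on which property of the vector map you plan to join on later.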

First of all, define a new Elasticsearch mapping called tweet, within the index tweetrepository:

curl -XPUT "esHost:9200/tweetrepository/tweet/_mapping" -d '{
  "tweet" : {
    "properties" : {
      "text" : { "type" : "text" },
      "country" : { "type" : "keyword" },
      "timestamp" : { "type" : "date" }
    }
  }
}'

Notice that the country field has the keyword type (a field type for structured content such as email addresses, hostnames, status codes, zip codes or tags). It will be used as the join field between the map and the terms aggregation in the Region Map visualization.

res = es.index(index='tweetrepository', doc_type='tweet', body=doc)  # Index the document to Elasticsearch
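
For reference, here is one way the doc passed to es.index could be assembled. It is only a sketch: the helper names build_doc and tweet_lat_lon are hypothetical, and it assumes the tweet is the parsed JSON from the streaming API, with created_at in Twitter's usual date format and coordinates as a GeoJSON Point (longitude first).

```python
from datetime import datetime

# Twitter's created_at format, e.g. "Wed Aug 27 13:08:45 +0000 2008"
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def tweet_lat_lon(tweet):
    """Return (lat, lon) from the tweet's GeoJSON point, or None if absent."""
    point = tweet.get("coordinates")
    if not point:
        return None
    lon, lat = point["coordinates"]  # GeoJSON order is [longitude, latitude]
    return lat, lon


def build_doc(tweet, country):
    """Build a document matching the text / country / timestamp mapping."""
    created = datetime.strptime(tweet["created_at"], TWITTER_DATE_FORMAT)
    return {
        "text": tweet["text"],
        "country": country,
        "timestamp": created.isoformat(),  # ISO 8601, accepted by the date type
    }


sample_tweet = {
    "created_at": "Wed Aug 27 13:08:45 +0000 2008",
    "text": "Hello from Rome",
    "coordinates": {"type": "Point", "coordinates": [12.4964, 41.9028]},
}

print(tweet_lat_lon(sample_tweet))
print(build_doc(sample_tweet, "Italy"))
```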

This is what an indexed document looks like.

We are now going to create a new region map visualization.

In the option section of the visualization, select the Vector Map. This is the map layer that will be used. This list includes the maps that are hosted by the Elastic Maps Service as well as your self-hosted layers that are configured in the config/kibana.yml file. To learn more about how to configure Kibana to make self-hosted layers available, see the region map settings documentation.

We will use the World Country vector map. The join field is the property from the selected vector map that will be used to join on the terms in your terms-aggregation. In this example the join field is the country name (so we can match the regions of the map with our documents).

In the style section you can choose the color scheme (red to green, shades of blue/green, heatmap) that will be used.

In the buckets section select the country field (the field from our mapping). The values of this field will be used as the lookup (join) key on the vector map.
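
Under the hood, the bucket configuration boils down to a terms aggregation on the country field. The sketch below shows roughly what such a request body and its result look like; the aggregation name tweets_per_country, the helper names and the sample response are illustrative assumptions, not what Kibana literally sends.

```python
def country_terms_query(size=200):
    """Search body counting tweets per country (no hits, only buckets)."""
    return {
        "size": 0,
        "aggs": {
            "tweets_per_country": {
                "terms": {"field": "country", "size": size}
            }
        },
    }


def buckets_to_counts(response):
    """Turn the aggregation buckets into a {country: doc_count} dict."""
    buckets = response["aggregations"]["tweets_per_country"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written response fragment in the shape Elasticsearch returns
sample_response = {
    "aggregations": {
        "tweets_per_country": {
            "buckets": [
                {"key": "Italy", "doc_count": 42},
                {"key": "France", "doc_count": 17},
            ]
        }
    }
}

print(country_terms_query())
print(buckets_to_counts(sample_response))
```

Each bucket key is what gets matched against the join field of the vector map, which is why the indexed country values have to use the same naming as the map.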

This is what our region map looks like. The darker countries are the ones with a higher number of Tweets.

I really like this new type of visualization: it is easy to use and lets you add nice map visualizations (even with self-hosted layers configured in the config/kibana.yml file) to your Kibana dashboards.

If you use Kibana to visualize logs and you use Logstash, take a look at this plugin: GeoIP Filter. The GeoIP filter adds information about the geographical location of IP addresses, based on data from the MaxMind GeoLite2 databases (so you can use the geographical location in your region map).

The machine learning feature is enabled by default on each node. You can find more details about further configuration here: Machine Learning Settings

We are going to use the following dataset: U.S. / U.K. Foreign Exchange Rate. It contains the daily foreign exchange rate between the U.S. Dollar and the U.K. Pound between April 1971 and the beginning of June 2017.

This is a sample of the data:

DATE,EXCHANGE_RATE
1971-01-07,2.3963
1971-01-08,2.3972
1971-01-11,2.3992
1971-01-12,2.4001
1971-01-13,2.4021
1971-01-14,2.4071
1971-01-15,2.4057
1971-01-18,2.4052
1971-01-19,2.4081
1971-01-20,2.4080
1971-01-21,2.4092
1971-01-22,2.4103

We will index the documents (around 16k) in time-based indices called foreignexchangerate-YYYY (where YYYY is the year of the document). A time-based index is necessary to use the machine learning feature: the configured time field of the index pattern is used by the feature for the time aggregation.
As far as I know, there is no way to use a non-time-based index and select a date field while creating a machine learning job.
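
As a rough illustration of the indexing step, the CSV rows can be turned into bulk actions that route each document to its yearly index. The function names are hypothetical; the action layout (_index, _type, _source) is the one expected by the bulk helper of the official Python client, which is an assumption about how you load the data.

```python
import csv
import io


def index_name(date_str, prefix="foreignexchangerate"):
    """Derive the yearly index name from a yyyy-MM-dd date string."""
    return "{}-{}".format(prefix, date_str[:4])


def rows_to_actions(csv_text):
    """Yield one bulk action per CSV row, routed to its yearly index."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {
            "_index": index_name(row["DATE"]),
            "_type": "rate",
            "_source": {
                "date": row["DATE"],
                "exchange_rate": float(row["EXCHANGE_RATE"]),
            },
        }


sample = "DATE,EXCHANGE_RATE\n1971-01-07,2.3963\n1971-01-08,2.3972\n"
actions = list(rows_to_actions(sample))
for action in actions:
    print(action)

# With a client at hand, the actions could then be sent with something like:
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch(["esHost:9200"]), actions)
```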

This is what each time-based index looks like:

curl -XPUT 'esHost:9200/foreignexchangerate-2017/' -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 1,
      "number_of_replicas" : 0
    }
  }
}'

curl -XPUT "esHost:9200/foreignexchangerate-2017/rate/_mapping" -d '{
  "rate" : {
    "properties" : {
      "date" : { "type" : "date", "format" : "yyyy-MM-dd" },
      "exchange_rate" : { "type" : "float" }
    }
  }
}'

Once the documents are indexed and the index pattern has been added to Kibana, we can create our first machine learning job.

To create a new Job, select the Machine Learning section from the left menu of Kibana (if you do not see it, maybe you have the wrong Kibana version or you did not install X-Pack into Kibana).

You can now choose between a Single Metric and a Multi Metric job. We will choose a Single Metric job (for the foreignexchangerate-* index pattern).

We will use the whole time series and a 3-day average of exchange_rate. The idea is to aggregate the series into 3-day buckets, compute the average exchange rate in each bucket and spot anomalies.
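
To make the aggregation concrete, here is a small plain-Python sketch of 3-day bucketing with a mean per bucket. It only illustrates the idea and is not how the machine learning feature is implemented; in particular, starting the buckets at the first data point is an assumption made here for simplicity.

```python
from collections import defaultdict
from datetime import date, timedelta


def bucket_means(points, span_days=3):
    """Group (date, value) points into fixed-size buckets and average them.

    Returns a list of (bucket_start_date, mean_value), ordered by time.
    """
    points = sorted(points)
    if not points:
        return []
    origin = points[0][0]  # buckets start at the first observation
    buckets = defaultdict(list)
    for day, value in points:
        buckets[(day - origin).days // span_days].append(value)
    return [
        (origin + timedelta(days=index * span_days), sum(values) / len(values))
        for index, values in sorted(buckets.items())
    ]


series = [
    (date(1971, 1, 7), 2.3963),
    (date(1971, 1, 8), 2.3972),
    (date(1971, 1, 11), 2.3992),
    (date(1971, 1, 12), 2.4001),
]

for start, mean in bucket_means(series):
    print(start, round(mean, 5))
```

Days without a quote (weekends, holidays) simply contribute no values, so a bucket can hold fewer than three observations.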

Once we have configured the job, we can create it. The machine learning model will be built using our time series and the aggregation/metric we specified.

We can now inspect the anomalies detected using the Anomaly Explorer or the Single Metric View, both from the ML Jobs dashboard.

I checked some of the anomalies that were automatically identified and almost all of them make sense (I found drops in the exchange rate due to events such as Brexit or the EU crisis).

So far we have done all the analysis inside Kibana, but the machine learning feature also comes with a set of APIs, so you can integrate time-series anomaly detection with your application.
Here you can find the details about the APIs: ML APIs.
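
As a hedged sketch of what an API-based setup might look like, the function below builds a request body for a job equivalent to the one configured above (mean of exchange_rate, 3-day bucket span). The job id is made up, and the endpoint path and field names reflect my reading of the X-Pack 5.x ML APIs, so check them against the ML APIs reference for your version before use.

```python
import json


def single_metric_job(job_id="foreignexchangerate-mean"):
    """Return (path, body) for a create-job request.

    Assumption: the 5.x X-Pack endpoint is _xpack/ml/anomaly_detectors/<job_id>.
    """
    path = "_xpack/ml/anomaly_detectors/{}".format(job_id)
    body = {
        "description": "3-day mean of the USD/GBP exchange rate",
        "analysis_config": {
            "bucket_span": "3d",  # same 3-day aggregation used in Kibana
            "detectors": [
                {"function": "mean", "field_name": "exchange_rate"}
            ],
        },
        # Name of the time field in our mapping
        "data_description": {"time_field": "date"},
    }
    return path, body


path, body = single_metric_job()
print("PUT", path)
print(json.dumps(body, indent=2))
```

A job created this way still needs a datafeed (or posted data) before it produces results; that part is left out here.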

In this post we saw a simple example of how to create and run a machine learning job inside Elasticsearch. There are a lot of other aspects, like the multi-metric and advanced jobs, that I think are important.

The machine learning features are pretty new and I think (and hope!) that Elastic will invest a lot of resources to improve and extend them.

I am going to run some other tests on the ML features, and I would like to run some anomaly detection algorithms (statistical and ML based) on the same dataset to benchmark and compare the Elasticsearch results. If you want to collaborate and help me (or if you have some knowledge/background in time series anomaly detection), drop me a line 🙂 .

With Kibana you can create intuitive charts and dashboards. Since August 2016 you can export your dashboards to PDF thanks to Reporting. With Elastic version 5, Reporting has been integrated into X-Pack for the Gold and Platinum subscriptions.

Recently I tried Skedler, an easy-to-use report scheduling and distribution application for Kibana that lets you centrally schedule and distribute Kibana dashboards and saved searches as hourly/daily/weekly/monthly PDF, XLS or PNG reports to various stakeholders. Skedler is a standalone app that provides its own dashboard where you can manage Kibana reporting tasks (schedules, dashboards and saved searches). Right now there are four different price plans (from the free to the premium edition).
Here you can find some useful resources about Skedler:

From the Skedler dashboard we can now schedule a report.
The steps to schedule a new report are the following:

Report Details

Create a new schedule, then select the Kibana dashboard or saved search and the output format. In the example I selected a dashboard called “My dashboard” (that I previously created in Kibana) and the PDF format.

Layout Details

Select the font-family, page size and company logo.

Schedule Details

Define a schedule frequency and a time window for the data.

Once you have finished the configuration, you will find the new schedule in the Skedler dashboard. You can set a list of email addresses to which the report will be sent.

If you want to see what your exported dashboard will look like, you can preview it. This is what my dashboard looks like (note that it is a PDF file).

In this post I demonstrated how to install and configure Skedler and how to create a simple schedule for a Kibana dashboard. My overall impression of Skedler is that it is a powerful application to use side-by-side with Kibana that helps you deliver your content directly to your stakeholders.

These are the main benefits that Skedler offers:

It’s easy to install

Linux and Windows support (it runs on Node.js server)

Reports generated locally (your data are not sent to cloud or Skedler servers)

Competitive price plans

Support for Kibana 4 and 5 releases

Automatically discovers your existing Kibana Dashboards and Saved Searches (so you can easily use Skedler in any environment, no new stack installation needed)

Lets you centrally schedule and manage who gets which reports and when they get them

Allows for hourly, weekly, monthly, and yearly schedules

Generates XLS and PNG reports besides PDF as opposed to Elastic Reporting that only supports PDF

I strongly recommend that you try Skedler (there is a free plan and a 21-day trial) because it can help you automatically deliver reports to your stakeholders and it integrates with your ELK environment without any modification to your stack.

Here you can find some more useful resources from the official website:
