Moreover, another very useful source of information is the log files.

While these two sources are very interesting, for “real life” monitoring we need some additional features:

The JMX information and log messages should be stored in order to be queried later and to build a history. For instance, using jconsole, you can read all the JMX attributes to get the numbers, but these numbers have to be stored somewhere. It’s much the same for the logs: most of the time, you define a log file rotation, or you periodically clean up the logs, so the log messages should be stored as well to be queried later.

Numbers are good, graphics are even better. Once the JMX “numbers” are stored somewhere, a good feature is to use these numbers to create charts. We can also define some kind of SLA: at some point, if a number is not “acceptable” (for instance, greater than a “watermark” value), we should raise an alert.

For high availability and scalability, most production systems use multiple Karaf instances (synchronized with Cellar, for instance). It means that the log files are spread across different machines. In that case, it’s really helpful to “centralize” the log messages.

Of course, there are already open source solutions (Zabbix, Nagios, etc.) or commercial solutions (Dynatrace, etc.) to cover these needs.

In this blog post, I introduce a possible solution leveraging “big data” tools: we will see how to use the ELK stack (Elasticsearch, Logstash, and Kibana).

Topology

For this example, let’s say we have the following architecture:

node1 is a machine hosting a Karaf container with a set of Camel routes.

node2 is a machine hosting a Karaf container with another set of Camel routes.

node3 is a machine hosting an ActiveMQ broker (used by the Camel routes from node1 and node2).

monitor is a machine hosting the monitoring platform.

Local to node1, node2, and node3, we install and configure Logstash with both the file and JMX input plugins. This Logstash will get the log messages and poll JMX MBean attributes, and send them to a “central” Redis server (using the redis output plugin).
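As a sketch, assuming the default Karaf log location and the Logstash 1.4.0.rc1 install path used later in this post (the Redis key, polling frequency, and thread count are illustrative values to adapt), the Logstash configuration on node1, node2, and node3 could look like:

input {
  file {
    type => "log"
    path => ["/opt/karaf/data/log/karaf.log"]
  }
  jmx {
    type => "jmx"
    path => "/opt/monitor/logstash-1.4.0.rc1/conf/jmx"
    polling_frequency => 15
    nb_thread => 2
  }
}
output {
  redis {
    host => "monitor"
    data_type => "list"
    key => "logstash"
  }
}

The file input tails the Karaf log, the jmx input polls the MBean attributes described below, and the redis output pushes everything to the Redis server on the monitor machine.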

On monitor, we install:

a Redis server to receive the messages and events coming from the Logstash instances installed on node1, node2, and node3

Elasticsearch to store the messages and events

a first Logstash acting as an indexer, taking the messages/events from Redis and storing them into Elasticsearch (including the update of the indexes, etc.), as sketched after this list

a second Logstash providing the Kibana web console
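For reference, here is a minimal sketch of the indexer configuration (the Redis key and the hosts are assumptions matching the node configuration above):

input {
  redis {
    host => "localhost"
    data_type => "list"
    key => "logstash"
    codec => json
  }
}
output {
  elasticsearch {
    host => "localhost"
  }
}

With Logstash 1.4.x, the second instance is simply the embedded Kibana 3 console, started with bin/logstash web.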

Redis and Elasticsearch

Redis

Redis is a key-value store. But it can also act as a broker to receive the messages/events from the different Logstash instances (node1, node2, and node3).
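A typical installation from source looks like this (the version number is illustrative; by default Redis listens on port 6379):

wget http://download.redis.io/releases/redis-2.8.7.tar.gz
tar zxvf redis-2.8.7.tar.gz
cd redis-2.8.7
make
src/redis-server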

JMX is not a “standard” Logstash plugin: it’s a plugin from the logstash-contrib project. As I modified the Logstash JMX plugin (to work “smoothly” with the Karaf MBeanServer), while waiting for my pull request to be merged into logstash-contrib (I hope ;)), you have to clone my github fork:
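Something along these lines (the URL below is a placeholder, to replace with the fork mentioned above):

# placeholder URL: replace with the actual logstash-contrib fork
git clone https://github.com/<my-fork>/logstash-contrib.git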

So, the jmx input plugin reads all files located in the /opt/monitor/logstash-1.4.0.rc1/conf/jmx folder.

On node1 and node2 (again, hosting a Karaf container with Camel routes), we want, for instance, to monitor the number of threads on the Karaf instance (using the Threading MBean), and a route named “route1” (using the Camel route MBean).
We specify this in the /opt/monitor/logstash-1.4.0.rc1/conf/jmx/karaf file:
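A sketch of this file could be the following (the JMX port, the credentials, and the exact Camel route ObjectName are assumptions to adapt to your instance; Karaf’s RMI registry listens on port 1099 by default):

{
  "host" : "localhost",
  "port" : 1099,
  "username" : "karaf",
  "password" : "karaf",
  "alias" : "node1",
  "queries" : [
    {
      "object_name" : "java.lang:type=Threading",
      "object_alias" : "Threading",
      "attributes" : [ "ThreadCount" ]
    },
    {
      "object_name" : "org.apache.camel:context=*,type=routes,name=\"route1\"",
      "object_alias" : "Route1",
      "attributes" : [ "LastProcessingTime", "ExchangesCompleted", "ExchangesFailed" ]
    }
  ]
}

The alias and object_alias values drive the metric_path of the generated events: for instance, the ThreadCount attribute above is sent as an event with metric_path set to node1.Threading.ThreadCount and the value in metric_value_number.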

On the welcome page, we click on the “Logstash dashboard” link, and we arrive on a console looking like this:

It’s time to configure Kibana.

We remove the default histogram and add a custom one to chart the thread count.

First, we create a query to isolate the thread count for node1. Kibana uses the Apache Lucene query syntax.
Our query here is very simple: metric_path:"node1.Threading.ThreadCount".

Now, we can create a histogram using this query, getting the metric_value_number:

Now, we want to chart the LastProcessingTime on the Camel route (to see, for instance, if the route takes more time at some point).
We create a new query to isolate the route1 LastProcessingTime on node1: metric_path:"node1.Route1.LastProcessingTime".

We can now create a histogram using this query, getting the metric_value_number:

For the demo, we can create a histogram chart to display the exchanges completed and failed for route1 on node1. We create two queries:

metric_path:"node1.Route1.ExchangesFailed"

metric_path:"node1.Route1.ExchangesCompleted"

We create a new chart in the same row:

We clean up the events panel a bit. We create a query to display only the log messages (not the JMX events): type:"log".
We configure the log event panel to change the name and use the log query:

We now have a Kibana console looking like this:

With this very simple Kibana configuration, we have:
– a chart of the thread count on node1
– a chart of the last processing time for route1 (on node1)
– a chart of the exchanges (failed/completed) for route1 (on node1)
– a view of all log messages

You can now play with Kibana and add a lot of new charts leveraging all the information that you have in Elasticsearch (both log messages and JMX data).

Next

I’m working on some new Karaf, Cellar, ActiveMQ, and Camel features providing “native” and “enhanced” support for Logstash. The purpose is to be able to just type feature:install monitoring to get: