Jul 3, 2013

Splunk to analyse Java logs and other machine data

Q. What is Splunk and where will you use it?

A. Splunk is an enterprise-grade software tool for collecting and analyzing “machine data” such as log files, feed files, and other big data running into terabytes. You can upload logs from your websites, let Splunk index them, and produce reports with graphs to analyze that data. This is very useful for capturing start and finish times from asynchronous processes to calculate elapsed times. For example, here are the basic steps required.

Step 1: Use log4j MDC logging to output context-based logs.

Step 2: Upload those logs into Splunk, which indexes them so that you can produce elapsed times for monitoring application performance.
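As a rough illustration of Step 1, here is a minimal sketch of context-based logging. To keep it dependency-free, it does not use log4j itself: a ThreadLocal map stands in for the MDC, and the class name ContextLogSketch, the requestId key, and the ts/event field names are all made up for this example. With real log4j you would put the value into the MDC and print it through the layout pattern instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ContextLogSketch {
    // Stand-in for log4j's MDC (assumption: used only to keep this sketch dependency-free).
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(LinkedHashMap::new);

    public static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    public static void clear() {
        CONTEXT.get().clear();
    }

    // Builds one log line: timestamp, then every context entry, then the event name.
    public static String format(long epochMillis, String event) {
        StringBuilder sb = new StringBuilder("ts=").append(epochMillis);
        for (Map.Entry<String, String> e : CONTEXT.get().entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.append(" event=").append(event).toString();
    }

    // Reads the ts= value back out of a line produced by format().
    public static long timestampOf(String line) {
        int end = line.indexOf(' ');
        return Long.parseLong(line.substring("ts=".length(), end));
    }

    public static void main(String[] args) {
        put("requestId", "req-42"); // hypothetical correlation id
        String start = format(1000L, "START");
        String finish = format(1750L, "FINISH");
        clear();
        System.out.println(start);
        System.out.println(finish);
        // The same subtraction a search would perform on the indexed events.
        System.out.println("elapsed_ms=" + (timestampOf(finish) - timestampOf(start)));
    }
}
```

Because every line carries the same requestId, the start and finish events of one asynchronous process can be paired up after indexing, even when lines from many requests are interleaved in the same file.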

Q. What are the different ways to get data into Splunk?

A.

Uploading a log file via Splunk's web interface.

Getting Splunk to monitor a local directory or file.

Splunk can index data from any network port. For example, Splunk can index remote data from syslog-ng or any other application that transmits via TCP. Splunk can also receive and index SNMP events.

Splunk also supports other kinds of data sources, such as FIFO queues and scripted inputs, to get data from APIs, other remote data interfaces, and message queues. For example, here is a simple scripted input defined in the inputs.conf file.

[script://$SCRIPT]
<attribute1> = <val1>
<attribute2> = <val2>
...
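A scripted input is simply a command whose standard output Splunk indexes. As a hedged sketch of what such a command might do, the program below prints one disk-space event per run as key=value pairs. The class name DiskSpaceInput and the field names are invented for this example, and it assumes that $SCRIPT in the stanza above points to a small wrapper script that launches the JVM.

```java
import java.io.File;

class DiskSpaceInput {
    // Builds one event line as key=value pairs (field names are illustrative).
    public static String formatEvent(long epochMillis, String path,
                                     long usableBytes, long totalBytes) {
        return "ts=" + epochMillis
                + " path=" + path
                + " usable_bytes=" + usableBytes
                + " total_bytes=" + totalBytes;
    }

    public static void main(String[] args) {
        // Path to inspect; defaults to the root directory when no argument is given.
        String path = args.length > 0 ? args[0] : "/";
        File dir = new File(path);
        // One line on standard output becomes one event for the scripted input.
        System.out.println(formatEvent(System.currentTimeMillis(), path,
                dir.getUsableSpace(), dir.getTotalSpace()));
    }
}
```

Keeping the formatting in its own method makes the output easy to check without running the program under Splunk.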

Here is an example of using Splunk to write a query against log files to monitor performance.