Monthly Archive for October, 2013

Event handlers have been a major part of Nagios Core and Nagios XI for a long time and can greatly increase the efficiency of your network and decrease incident response time. From restarting downed services and forwarding emails to a ticketing system to updating thousands of servers, the possibilities for event handlers are limited only by your imagination. Below, you will find the introductory portion of the documentation, which will walk you through the creation of your first event handler in its most basic form. If the following example isn’t enough, the full documentation includes a complete tutorial for event handler macros, and more advanced users can learn how to fully customize their Nagios environment.

Step I. Create a command in XI for the event handler

In XI, go to the CCM (Configure → Core Config Manager → Commands). Click “Add New” in the right-hand pane. Give the command a name; we will use “event_handler_test” for this example.

Now we need to define the “Command Line”. For this example it is “$USER1$/event_handler_test.sh”. “$USER1$” references the folder “/usr/local/nagios/libexec”, as defined in the resource.cfg file. This is the default path for plugins and scripts in XI. We will create the script called by this command later on; for the sake of uniformity, we will name it “event_handler_test.sh”.

As this command will not be used to check hosts or services, but will be used as an event handler, set the “Command Type” dropdown to “misc command”.

Make sure the “Active” box is checked.

The final command definition should resemble the image to the right. Click save when finished.

Step II. Create a dummy host

In XI, go to the CCM (Configure → Core Config Manager → Monitoring → Hosts). Click “Add New”. We will set the “Host Name” field as “event_handler_test”.

As this is just a mock-up, set “Address” to the localhost address “127.0.0.1”.

Clicking “Test Check Command” should output “OK”, which means you have set up the command correctly.

Set the required defaults:

Under the “Check Settings” tab, enter “5” for “Check interval” and “Max retry attempts”, and “1” for “Retry interval”.

Set the “Check period” dropdown to “xi_timeperiod_24x7”.

Under the “Alert Settings” tab, set the “notification period” dropdown to “xi_timeperiod_24x7”.

Step III. Add the event handler to the dummy host

Under the “Check Settings” tab for the dummy host, choose “event_handler_test” in the “Event handler” dropdown box.

Enable the event handler by selecting “on” for the “Event handler enabled” option. Save, and then click “Apply configuration”.
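
For readers more familiar with Nagios Core object files, the settings from Steps I to III correspond roughly to the definitions below. This is a sketch for orientation only, not a complete or authoritative configuration: XI writes its own configuration files when you apply configuration, the command line assumes the script name chosen in Step I, and the check command is left out because it depends on what you selected in the CCM.

```cfg
define command {
    command_name    event_handler_test
    command_line    $USER1$/event_handler_test.sh
}

define host {
    host_name               event_handler_test
    address                 127.0.0.1
    check_interval          5
    retry_interval          1
    max_check_attempts      5
    check_period            xi_timeperiod_24x7
    notification_period     xi_timeperiod_24x7
    event_handler           event_handler_test
    event_handler_enabled   1
    ; check_command is omitted here: use the one selected in the CCM
}
```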

Step IV. Create a script for the handler to run

Now that the XI event handler and host object are configured, we need to create a script to be run by the handler. The first iteration of the script will just output some *very* basic information to a text file in /tmp. In the following sections of this document we will add functionality and complexity to the script and command of the handler (mostly macros and script logic). But before we get to the fun stuff, let’s make sure the basic script works.

Open up a terminal to the XI server and navigate to the plugin directory:

cd /usr/local/nagios/libexec

Now create a file named “event_handler_test.sh”, make it executable and edit it:

#!/bin/bash
# Set the DATE variable to the output of the 'date' command.
DATE=$(date)
# Write a line of text, including the contents of DATE, to a file (overwriting any previous contents).
echo "The host has changed state at $DATE" > /tmp/hostinfo.txt

Save out.
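
The create-and-make-executable step above can also be sketched as a small helper. The function name create_handler is hypothetical, and the script body it writes is the same test script shown above; on the XI server the target directory would be /usr/local/nagios/libexec.

```shell
#!/bin/bash
# Hypothetical helper: create_handler DIR writes the test script into DIR,
# marks it executable, and prints the path of the file it created.
create_handler() {
    local dir="$1"
    local script="$dir/event_handler_test.sh"
    cat > "$script" <<'EOF'
#!/bin/bash
DATE=$(date)
echo "The host has changed state at $DATE" > /tmp/hostinfo.txt
EOF
    chmod +x "$script"
    echo "$script"
}

# On the XI server you would call:
#   create_handler /usr/local/nagios/libexec
```

Running the resulting script once by hand and then checking /tmp/hostinfo.txt is a quick way to confirm it works before Nagios calls it. Nagios typically runs event handlers as the nagios user, so that user needs execute permission on the script.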

Step V. Test the event handler

Now that everything is in place, we need to force the dummy host into a down state to verify that the hostinfo.txt file is created in /tmp. The easiest way to do this is to submit a passive check with the “Check Result” of “DOWN”. This should trigger the event handler, as it will register as a state change.

Click “Commit”, and then “Done”. You should see the host change to a down state after a moment or two. Now let’s check the hostinfo.txt file from the XI server’s CLI:

cat /tmp/hostinfo.txt

You should see output resembling:

The host has changed state at Sun Jun 2 22:36:22 CDT 2013
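
The passive check in this step can also be submitted from the command line through the Nagios external command file, using the PROCESS_HOST_CHECK_RESULT command. The following is a minimal sketch, not the method used in this tutorial: the function name passive_host_result is hypothetical, and the command file path is assumed to be the usual default, so adjust it if your installation differs.

```shell
#!/bin/bash
# Assumption: default location of the external command pipe.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Hypothetical helper: build one PROCESS_HOST_CHECK_RESULT line.
# Host status codes: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
passive_host_result() {
    local host="$1" code="$2" output="$3"
    printf '[%s] PROCESS_HOST_CHECK_RESULT;%s;%s;%s\n' \
        "$(date +%s)" "$host" "$code" "$output"
}

line=$(passive_host_result "event_handler_test" 1 "Forced DOWN to test the event handler")
echo "$line"

# Only write to the command file if the named pipe actually exists.
if [ -p "$CMD_FILE" ]; then
    echo "$line" > "$CMD_FILE"
fi
```

Submitting the same line with a status code of 0 should bring the dummy host back to an up state, which counts as another state change.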

Congratulations! You have created your first event handler. With this knowledge, you can automate nearly anything in your environment to keep it running smoothly. Don’t know bash scripting? No problem! The full document, linked below, includes a number of use cases in bash to get you started. From emailing a ticketing system to performing yum updates, event handlers can help you create a more self-sufficient network environment and integrate with other systems in your organization.
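
As a taste of the macros and script logic mentioned earlier, here is a rough sketch of a handler that reacts differently depending on the host state. It assumes the command line from Step I has been extended to “$USER1$/event_handler_test.sh $HOSTSTATE$ $HOSTSTATETYPE$ $HOSTATTEMPT$”, so the three macros arrive as positional arguments. The function name handle_host_event and the messages it prints are made up for illustration and are not taken from the full documentation.

```shell
#!/bin/bash
# Hypothetical helper: decide what to do based on the macro values.
#   $1 = $HOSTSTATE$      (UP, DOWN, UNREACHABLE)
#   $2 = $HOSTSTATETYPE$  (SOFT or HARD)
#   $3 = $HOSTATTEMPT$    (current check attempt number)
handle_host_event() {
    local state="$1" state_type="$2" attempt="$3"
    case "$state" in
        DOWN)
            if [ "$state_type" = "HARD" ]; then
                echo "act: host is DOWN (HARD) after attempt $attempt"
            else
                echo "wait: host is DOWN (SOFT), attempt $attempt"
            fi
            ;;
        UP)
            echo "ok: host is UP"
            ;;
        *)
            echo "ignore: state $state"
            ;;
    esac
}

# Assumption: reuse the log file from the basic example, appending instead of overwriting.
LOG_FILE="${LOG_FILE:-/tmp/hostinfo.txt}"
if [ "$#" -ge 3 ]; then
    handle_host_event "$1" "$2" "$3" >> "$LOG_FILE"
fi
```

In a real handler, the “act” branch is where a restart command or a ticketing email would go; waiting for a HARD state avoids reacting to a single failed check.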

If you missed my post from March 2, 2012 on how to clone a host along with its services in Nagios XI, now is the time to revisit this topic. There is a new video available that shows you how to use this amazing tool.

Setting up a monitoring system can be a daunting task, especially when you have hundreds of machines to monitor, but Nagios makes it simple. The Bulk Host Cloning and Import Wizard is easy to use: it lets you take a host that has already been set up according to your specifications and copy this “template host” into many other hosts that you want to monitor in the same manner.

For example, if you have 500 devices that you need monitored and you want them all monitored in the same manner, all you need to do is set up the first one. Once that is done, you can clone that template via the Bulk Host Import Wizard into the other 499 hosts so that they will be set up in the exact same manner. This will save you a lot of time and effort!

To see how easy it is to use the Bulk Host Cloning and Import Wizard, please watch our video. It is less than 3 minutes long, so it won’t take much of your time.

Note: If you watched my presentation (“Bulk Management Of Host And Services In Nagios XI”) during the Nagios World Conference 2012, you probably remember that in the past you couldn’t clone a host unless you had at least one service selected. This has changed in the new version of the Bulk Host Cloning and Import Wizard, as per customer requests. Thank you for your feedback! Now you can even clone hosts with no services.

Major improvements to agent-based monitoring have been taking place at Nagios Enterprises. NCPA, the Nagios Cross-Platform Agent, is a project that has the potential to revolutionize agent-based monitoring and increase the efficiency of IT support teams worldwide.

As many Nagios users know, monitoring with agents means juggling the installation of many different types of plugins to try to match devices, operating systems, and the basic functions of each agent. For example, in a simple agent-based Linux and Windows server environment you have to install two agents, learn two user manuals, spend twice the troubleshooting hours, run twice the commands on remote systems, and sift through two change logs for potential update breaks, and the list goes on. It can be very difficult to stay organized, and implementing and updating your configuration can take a lot of time, especially as your monitoring environment becomes larger and more complex.

Whether your environment is large or small, there are usually a myriad of devices that need to be monitored and more often than not, some sort of agent needs to be installed on these devices.

Wouldn’t it be simple if you only had to install one agent regardless of operating system or device?

We have been working on a project that aims to do this. The Nagios Cross-Platform Agent (NCPA) is a fully contained agent that runs on Mac OS X, Windows, and Linux and seeks to solve all of the previously mentioned pitfalls of agent-based monitoring with Nagios. The main goal of NCPA was to monitor the core metrics of servers and other devices without the added hassle of plugins and dependencies. Metrics such as CPU usage, disk usage, memory usage, interface usage, swap usage, and user count are preloaded in NCPA, so all you have to do is install the agent. It has since broadened in scope to become a general-purpose agent that is still very good at that original job. Just install NCPA on your system, and away you go.