Monitoring

This page documents our monitoring and alerting scripts.

Munin

We use Munin for graphing rather than alerting: it periodically pulls system data and displays it in RRDtool graphs. Munin comes in two pieces: munin and munin-node. The munin-node part is a daemon that gathers the data on each system; the munin part runs via cron and aggregates the data from the munin-node daemons running on the various systems.

Installing Munin (both parts) requires a few other libraries; we install it like this:

TODO

If we run munin-node on a system that we'll pull data from remotely, we'll need to edit the munin-node.conf file accordingly, and also open up TCP port 4949 via Shorewall.
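As a rough illustration of those two changes, the sketch below prints the lines that would be added. The master address 192.0.2.10 and the Shorewall zone name `net` are placeholders, and the helper function names are made up for this example; munin-node.conf takes `allow` directives written as anchored regular expressions, and Shorewall rules follow the ACTION SOURCE DEST PROTO DPORT layout.

```shell
#!/bin/sh
# Hypothetical address of the Munin master that polls this node.
MASTER_IP='192.0.2.10'

# munin_allow_line IP: print the munin-node.conf "allow" directive for one
# address. The directive is a regex, so the dots are escaped and the
# pattern is anchored at both ends.
munin_allow_line() {
    printf 'allow ^%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}

# shorewall_rule IP: print a Shorewall rules entry that accepts TCP 4949
# from that address. The zone name "net" is an assumption.
shorewall_rule() {
    printf 'ACCEPT net:%s $FW tcp 4949\n' "$1"
}

munin_allow_line "$MASTER_IP"   # line for /etc/munin/munin-node.conf
shorewall_rule "$MASTER_IP"     # line for /etc/shorewall/rules
```

After adding the lines to the two files, munin-node needs a restart and Shorewall a reload before the master can connect.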

If we pull data from any systems across the Internet, we should enable TLS and certificates.

Schedule Regular Updates

It would be nice to have updates install automatically, but to prevent problems it's best to have a system administrator apply them manually and fix any problems that crop up. So instead, we alert the system administrators when updates are available.

We've adapted code from here to check for new Debian updates. Save the following code to /etc/cron.daily/check-debian-updates:
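A minimal sketch of such a check, not the adapted code itself, might look like the following. It assumes that `apt-get -s upgrade` marks each pending package with a line starting with `Inst`, that a `mail` command is installed, and that `root` is a suitable recipient; the function names and `MAILTO` value are placeholders.

```shell
#!/bin/sh
# Sketch of /etc/cron.daily/check-debian-updates (not the original listing).
# MAILTO is a placeholder; point it at your own mailing list.
MAILTO='root'

# pending_upgrades: read simulated apt-get output on stdin and print the
# name of every package that would be installed or upgraded.
pending_upgrades() {
    awk '$1 == "Inst" { print $2 }'
}

# check_updates: refresh the package lists, then mail the list of pending
# upgrades. Nothing is sent when the list is empty.
check_updates() {
    apt-get -qq update || return 1
    updates=$(apt-get -s upgrade | pending_upgrades)
    if [ -n "$updates" ]; then
        printf 'Updates are available on %s:\n\n%s\n' "$(hostname)" "$updates" |
            mail -s "Debian updates available on $(hostname)" "$MAILTO"
    fi
}

# Only run the check when invoked under the cron script's own name, so the
# functions can also be sourced and tried out by hand.
case "$0" in
    */check-debian-updates) check_updates ;;
esac
```

The script has to be executable (chmod 755) for run-parts to pick it up.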

Adding this script to the /etc/cron.daily directory will cause it to be run every day. By default, the daily cron scripts run at 6:25 AM. One nice thing about running them daily and sending them to a mailing list is that it's easy to see if the updates have or have not been applied by the next day. The more times the message is sent, the more likely someone will be to log in and run the updates.

Note that some existing packages do this same task – cron-apt and apticron are two that I've come across.

Apticron

sudo apt-get install apticron

Alert on Low Disk Space

This script works much like the previous script, sending an email only if any partition is over 90% full. Save the following code to /etc/cron.daily/check-disk-space:
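One way such a check could be written is sketched below. It is an illustration rather than the script we actually run: the function names and the `MAILTO` value are placeholders, it assumes a `mail` command is installed, and it parses `df -P` output, so mount points containing spaces are not handled.

```shell
#!/bin/sh
# Sketch of /etc/cron.daily/check-disk-space (illustrative only).
# MAILTO and LIMIT are placeholders to adjust.
MAILTO='root'
LIMIT=90

# over_threshold LIMIT: read `df -P` output on stdin and print one line
# for every filesystem whose usage percentage is above LIMIT.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 > limit + 0) print $6 " is " $5 " full"
    }'
}

# check_disk_space: mail a report only when some partition is over LIMIT.
check_disk_space() {
    report=$(df -P | over_threshold "$LIMIT")
    if [ -n "$report" ]; then
        printf '%s\n' "$report" |
            mail -s "Low disk space on $(hostname)" "$MAILTO"
    fi
}

# Only run the check when invoked under the cron script's own name, so the
# functions can also be sourced and tried out by hand.
case "$0" in
    */check-disk-space) check_disk_space ;;
esac
```

As with the other cron scripts, the file has to be executable (chmod 755) for run-parts to pick it up.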

Root Password Change Reminders

Root passwords should be changed at least every 6 months.
We decided to send out an email reminder to help ensure that we do that.

sudo sh -c 'cat > /etc/cron.monthly/root-password-reminder' <<'EOFILE'
#!/bin/sh
HOSTNAME=$(hostname)
MAILTO='craig@boochtek.com'
MAILFROM='Root password reminder <root@boochtek.com>'
MONTH=$(date +'%m')
# This checks to see if it is July or January. If so, send out the reminder.
# Since this script is in cron.monthly, it only runs on the 1st of the month.
if [ "$MONTH" = '07' ] || [ "$MONTH" = '01' ]; then
    mail -a "From: $MAILFROM" -s "Change root password on $HOSTNAME" "$MAILTO" <<EOF
Please change the root password on $HOSTNAME.
Whoever changes the root password, please reply to this email to
let everyone know that you've changed it. Provide your phone number
so that the other admins can call you to get the new password.
This script is located in /etc/cron.monthly/root-password-reminder,
and sends emails out on July 1 and January 1.
EOF
fi
exit 0;
EOFILE
# Change the permissions on the script to make it executable:
sudo chmod 755 /etc/cron.monthly/root-password-reminder

Adding this script to the /etc/cron.monthly directory will cause it to be run on the 1st day of every month. The script itself checks to see if it's January or July, and only sends an email for those months. By default, the monthly cron scripts run at 6:52 AM.

File Integrity Monitoring

We chose fcheck to monitor changes to system files. It's pretty simple – it just sends an email to root with a list of files that have changed since the last time it was run.

# Install the fcheck package.
sudo apt-get install fcheck
# Set the display timezone, so times are in our own timezone.
sudo sed -i -e "s|^TimeZone.*\$|TimeZone = $(cat /etc/timezone)|" /etc/fcheck/fcheck.cfg
# By default, fcheck runs from cron every 2 hours. We change it to run every 12 hours instead:
sudo sed -e 's|^30 \*/2|30 */12|' -i /etc/cron.d/fcheck

TODO

If we're on a non-virtual system, we should also install lm-sensors, acpi, and smartmontools.

Determine if there's any reason to switch from fcheck to Tripwire or something else.

Consider some of the all-in-one host monitoring systems, such as Samhain (HIDS,