RAID disk monitoring on XenServer with email alerting

Unlike ESXi, XenServer doesn’t have much health monitoring built in (especially not the free license versions). XenServer has always been the lightweight, thin, streamlined and better-performing virtualization product, though as time progresses Citrix is slowly adding more features and modifying its licensing structure to be more competitive. But for now we’ll handle our monitoring the old-fashioned way, and that’s perfectly fine: there’s nothing like being an actual systems administrator who knows how to work with the products you employ.

Please be sure to check out our article on S.M.A.R.T. disk monitoring, as it should be used in conjunction with this article for XenServer. And of course, always use external backups along with your RAID.

While we’re using a 3ware (LSI) 9650SE RAID card, the process should be relatively similar for other hardware RAID devices. We’ll assume that you already have the correct driver loaded for your RAID card and that your XenServer system works.

Download the Linux CLI tools for your RAID card

The first thing you’ll need is a set of command line tools for your RAID card. These are typically provided by the hardware vendor in the support/download section of its website.

You can download the tools directly to your XenServer hypervisor using wget, or SCP the tw_cli file from another machine. These command line tools are the core of what we need to monitor the status of our RAID array. There are no special install instructions; just # chmod 750 the tw_cli file so that it can be executed.

Now let’s check the status of our array manually. (Please see the built-in help of your CLI tools for usage.)
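For a 3ware card, a manual check might look like the sketch below. It assumes tw_cli was copied to /root and that the array sits on the first controller, /c0; the bare show command lists the controllers so you can find the right ID on your system. The helper function name is purely illustrative.

```shell
# Assumed location of the vendor CLI; adjust to wherever you placed it.
TW_CLI="${TW_CLI:-/root/tw_cli}"

# Illustrative helper: list the controllers, then the units and ports on /c0.
show_raid_status() {
    if ! command -v "$TW_CLI" >/dev/null 2>&1; then
        echo "tw_cli not found or not executable: $TW_CLI" >&2
        return 1
    fi
    "$TW_CLI" show        # controllers present in the system
    "$TW_CLI" /c0 show    # unit and port status for controller 0 (assumed ID)
}
```

Calling show_raid_status from a root shell runs both commands; typing them by hand (# ./tw_cli /c0 show) works just as well.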

The basic output gives us information on our current array. What we’ll be looking for is a change in the status field to “DEGRADED”, as this is when we’d like to be alerted. Other statuses such as “REBUILDING” can also be used to send a notification while an array is being rebuilt.

Creating a Custom Monitoring Script

Our custom XenServer RAID monitoring script is pretty basic, as all we really want at this point is to know when the array hits a DEGRADED status and to be emailed when that happens.

You can use this script as a starting point for your own custom monitoring, though for us this is sufficient. We set up a cron job to run our monitor-raid.sh script every 15 minutes. If you’re implementing this on multiple servers, change the email subject to include the hostname. (Note: make sure monitor-raid.sh is executable.)
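As a minimal sketch of what such a script might look like (not necessarily the exact script used here), the version below assumes that tw_cli lives in /root, that the controller is /c0, and that the host has a working mail command. The recipient address, variable names and function names are placeholders.

```shell
#!/bin/sh
# monitor-raid.sh: minimal sketch of a degraded-array check.
# Assumptions: tw_cli is in /root, the controller is /c0, and the host
# has a working `mail` command.

TW_CLI="${TW_CLI:-/root/tw_cli}"
CONTROLLER="${CONTROLLER:-/c0}"
MAIL_TO="${MAIL_TO:-admin@example.com}"        # placeholder address
ALERT_PATTERN="${ALERT_PATTERN:-DEGRADED}"     # extended regex to alert on

# Exit 0 when the status text on stdin matches the alert pattern.
needs_alert() {
    grep -Eq "$ALERT_PATTERN"
}

# Query the controller and email its output only when an alert is needed.
check_raid() {
    if ! status="$("$TW_CLI" "$CONTROLLER" show 2>&1)"; then
        echo "could not query $CONTROLLER with $TW_CLI" >&2
        return 1
    fi
    if printf '%s\n' "$status" | needs_alert; then
        printf '%s\n' "$status" |
            mail -s "RAID alert on $(hostname): $CONTROLLER" "$MAIL_TO"
    fi
}

# Run the check only when this file is executed as monitor-raid.sh, so the
# functions can be sourced elsewhere without side effects.
case "$0" in
    *monitor-raid.sh) check_raid ;;
esac
```

The subject line already carries the hostname, which helps when the same script runs on several servers. Setting ALERT_PATTERN to 'DEGRADED|REBUILDING' would also send mail while an array is rebuilding.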

Creating a CRON job

Creating a cron job is fairly easy: run # crontab -e and add the following line (:wq will write and quit when finished).

*/15 * * * * /root/monitor-raid.sh >/dev/null 2>&1

And that’s it. Now our script checks the array every 15 minutes and only emails us when it’s degraded. Simple and effective.