Oracle Blog

Useful stuff for your blog-reading pleasure.

Thursday Sep 17, 2009

One of the most important features of ZFS is the ability to detect data corruption through the use of end-to-end checksums. In redundant ZFS pools (pools that are either mirrored or use a variant of RAID-Z), this can be used to fix broken data blocks by using the redundancy of the pool to reconstruct the data. This is often called self-healing.

This mechanism works whenever ZFS accesses any data, because it will always verify the checksum after reading a block of data. Unfortunately, this does not work if you don't regularly look at your data: Bit rot happens and with every broken block that is not checked (and therefore not corrected), the probability increases that even the redundant copy will be affected by bit rot too, resulting in data corruption.

Therefore, zpool(1M) provides the useful scrub sub-command which will systematically go through each data block on the pool and verify its checksum. On redundant pools, it will automatically fix any broken blocks and make sure your data is healthy and clean.

It should now be clear that every system should regularly scrub their pools to take full advantage of the ZFS self-healing feature. But you know how it is: You set up your server and often those little things get overlooked and that cron(1M) job you wanted to set up for regular pool scrubbing fell off your radar etc.

Introducing the ZFS Auto-Scrub SMF Service

Here's a service that is easy to install and configure that will make sure all of your pools will be scrubbed at least once a month. Advanced users can set up individualized schedules per pool with different scrubbing periods. It is implemented as an SMF service which means it can be easily managed using svcadm(1M) and customized using svccfg(1M).

Requirements

The ZFS Auto-Scrub service assumes it is running on OpenSolaris. It should run on any recent distribution of OpenSolaris without problems.

More specifically, it uses the -d switch of the GNU variant of date(1) to parse human-readable date values. Make sure that /usr/gnu/bin/date is available (which is the default in OpenSolaris).

Right now, this service does not work on Solaris 10 out of the box (unless you install GNU date in /usr/gnu/bin). A future version of this script will work around this issue to make it easily usable on Solaris 10 systems as well.

Download and Installation

After unpacking the archive, start the install script as a privileged user:

pfexec ./install.sh

The script will copy three SMF method scripts into /lib/svc/method, import three SMF manifests and start a service that creates a new Solaris role for managing the service's privileges while it is running. It also installs the OpenSolaris Visual Panels package and adds a simple GUI to manage this service.

After installation, you need to activate the service. This can be done easily with:

svcadm enable auto-scrub:monthly

or by running the GUI with:

vp zfs-auto-scrub

This will activate a pre-defined instance of the service that makes sure each of your pools is scrubbed at least once a month.

This is all you need to do to make sure all your pools are regularly scrubbed.

If your pools haven't been scrubbed before or if the time or their last scrub is unknown, the script will proceed and start scrubbing. Keep in mind that scrubbing consumes a significant amount of system resources, so if you feel that a currently running scrub slows your system too much, you can interrupt it by saying:

pfexec zpool scrub -s <pool name>

In this case, don't worry, you can always start a manual scrub at a more suitable time or wait until the service kicks in by itself during the next scheduled scrubbing period.

Should you want to get rid of this service, use:

pfexec ./install.sh -d

The script will then disable any instances of the service, remove the manifests from the SMF repository, delete the scripts from /lib/svc/method, remove the special role and the authorizations the service created and finally remove the GUI. Notice that it will not remove the OpenSolaris Visual Panels package in case you want to use it for other purposes. Should you want to get rid of this as well, you can do so by saying:

pkg uninstall OSOLvpanels

Advanced Use

You can create your own instances of this service for individual pools at specified intervals. Here's an example:

This example will create and activate a service instance that makes sure the pool "mypool" is scrubbed once a week.

Check out the zfs-auto-scrub.xml file to learn more about how these properties work.

Implementation Details

Here are some interesting aspects of this service that I came across while writing it:

The service comes with its own Solaris role zfsscrub under which the script runs. The role has just the authorizations and profiles necessary to carry out its job, following the Solaris Role-Based Access Control philosophy. It comes with its own SMF service that takes care of creating the role if necessary, then disables itself. This makes a future deployment of this service with pkg(1) easier, which does not allow any scripts to be started during installation, but does allow activation of newly installed SMF services.

While zpool(1M) status can show you the last time a pool has been scrubbed, this information is not stored persistently. Every time you reboot or export/import the pool, ZFS loses track of when the last scrub of this pool occurred. This has been filed as CR 6878281. Until that has been resolved, we need to take care of remembering the time of last scrub ourselves. This is done by introducing another SMF service that periodically checks the scrub status, then records the completion date/time of the scrub in a custom ZFS property called org.opensolaris.auto-scrub:lastscrub in the pool's root filesystem when finished. We call this service whenever a scrub is started and it deactivates itself once it's job is done.

As mentioned above, the GUI is based on the OpenSolaris Visual Panels project. Many thanks to the people on its discussion list to help me get going. More about creating a visual panels GUI in a future blog entry.

Lessons learned

It's funny how a very simple task like "Write an SMF service that takes care of regular zpool scrubbing" can develop into a moderately complex thing. It grew into three different services instead of one, each with their own scripts and SMF manifests. It required an extra RBAC role to make it more secure. I ran into some zpool(1M) limitations which I now feel are worthy of RFEs and working around them made the whole thing slightly more complex. Add an install and de-install script and some minor quirks like using GNU date(1) instead of the regular one to have a reliable parser for human-readable date strings, not to mention a GUI and you cover quite a lot of ground even with a service as seemingly simple as this.

But this is what made this project interesting to me: I learned a lot about RBAC and SMF (of course), some new scripting hacks from the existing ZFS Auto-Snapshot service, found a few minor bugs (in the ZFS Auto-Snapshot service) and RFEs, programmed some Java including the use of the NetBeans GUI builder and had some fun with scripting, finding solutions and making sure stuff is more or less cleanly implemented.

I'd like to encourage everyone to write their own SMF services for whatever tools they install or write for themselves. It helps you think your stuff through, make it easy to install and manage, and you get a better feel of how Solaris and its subsystems work. And you can have some fun too. The easiest way to get started is by looking at what others have done. You'll find a lot of SMF scripts in /lib/svc/method and you can extract the manifests of already installed services using svccfg export. Find an SMF service that is similar to the one you want to implement, check out how it works and start adapting it to your needs until your own service is alive and kicking.

Edit (Sep. 21st) Changed the link to CR 6878281 to the externally visible OpenSolaris bug database version, added a link to the session details on OSDevCon.

Edit (Jun. 27th, 2011) As the Mediacast service was decommissioned, I have re-hosted the archive in my new blog and updated the download link. Since vpanels has changed a lot lately, the vpanels integration doesn't work any more, but the SMF service still does.