A multi-threaded file watch utility, Fido can monitor a file or directory to see if its modification time changed, and it’s intuitive enough to recognize a change even if the daemon was down at the time of the change. Written for RedHat Enterprise Linux, Fido should run in all Linux flavors, and the source bundle contains RedHat init.d scripts for users’ convenience.

Key Features:

Content changes can be recognized with regular expression pattern matches

Good morning, JoeDoggers. Let’s bask in the glow of Your Fido this morning; he’s all grown up and ready for love. What does that mean? Well, it means it now behaves like a contemporary modern daemon. Starting with version 1.1.5, if you send it SIGHUP, it will reload its configuration file.

Really? It’s been out since 2011 and you’re only adding that feature now?

Hey, what do you want from me? It’s free, isn’t it?

Here’s how it works: if you change fido’s configuration file, you can send it SIGHUP to reload its key = value pairs. There’s just one thing it won’t reload: its filenames.

Remember, a fido configuration file is divided into two parts; it contains global settings and file settings. The file settings are distinguished by a filename followed by two brackets like this: {}. Here’s an example:

/usr/local/var/my.log {
# key = value pairs go here.
}

So if you change /usr/local/var/my.log to anything else, you’ll have to restart fido. If you change any other values, then you can just send it SIGHUP.

So how do I send it SIGHUP?

There’s several ways of doing this.

1.) You can look for the process ID (PID) with the ps command and send it SIGHUP (which is signal number 1):

Okay, that’s Google. So in order to validate and record the time of Google’s last crawl, we have to check the IP address. How do we achieve this?

We’ll use fido to check our logs for instances of Googlebot but it can’t validate the IP address. Our action program can do that but how do we pass it the address? Prior to fido-1.1.4, that would have been impossible. Starting with 1.1.4 it can now do regex capture and pass those variables to the action program.

To set this up, you’ll need to a file block in fido.conf which points to your access_log.

It’s a good thing Your JoeDog is not very bright. If he were an intelligent man, then he’d realize the difficulty of the task he was about to undertake before he undertook it. He would look at that task and think, “I’m not good enough to code that.” Fortunately, he’s not very bright and that means fido has a new feature.

New feature? Exciting!

Beginning with version 1.1.4, fido can do regex capture and pass the back references to the program it should run. Let’s look at an example config to help us make some sense of this. The new feature is highlighted in red.

In this exercise, we want to find instances of GoogleBot in the access log, but we also want to verify that it’s actually the GoogleBot and not a crawler with a forged User-agent. To accomplish that, we want to capture the IP address and send it to our program for validation.

So let’s look at the rule:

^([0-9]+.[0-9]+.[0-9]+.[0-9]+).*GoogleBot

If a line being with an quartet of dot separated numbers, i.e., an IP address, and it contains GoogleBot, then we have a match. Notice that IP address is wrapped in parentheses? That’s our capture. Everything within the parens will be assigned to $1. (BTW: Writing that was a royal PITA.) If we had two sets of parens, then the second set would be assigned to $2.

Fido will assign variables by name so it doesn’t matter which order you pass them to your program.

Your JoeDog had a requirements change. “Stupid requirements!” He had to ensure each file in a directory and all its sub-directories was less than eight days old. Unfortunately, Your Fido didn’t traverse directory trees. He stood watch only at the top of the tree.

That’s the problem with dogs: they have a mind of their own.

Without much effort, fido learned a new trick. It now recursively searches a directory for files. To leverage this feature, you’ll have to give it a command. “Recurse, boy, recurse!”

recurse takes one of two values, true or false. True means search the tree and false means remain at the top level. If you don’t set a recurse directive, then fido will treat it as false, i.e., it will remain in the top directory.

I use Mondoarchive to create Linux recovery disks. Each server writes ISO images to a shared volume on a weekly basis. If any file inside that directory is older than seven days, then a server failed to create an ISO. In order to monitor this directory for failure, I added a new feature to fido. Exciting!

Starting with version 1.1.0 (click to download), fido can monitor a file or directory to see if it — or any file inside it — is older than a user configurable period of time. If fido discovers a file whose modification date exceeds the configured time, it fires an alert.

The following example illustrates how to configure the use case above:

This file block applies to “/export” which is a directory. Since it’s a directory, the rules apply to every file inside it. In this case ‘rules’ is pretty straight forward. We’re looking for files that exceed eight days in age. This rule will always follow this format: exceeds [int] [modifier]. The modifier can be seconds, minutes, hours or days. If you take the long view — if you’re concerned about events far into the future — then you’ll have to do some math. We don’t designate years so you’ll have to use 1825 days if you want to be alerted five years out.

We also find a new feature inside this block. ‘exclude’ takes a regular expression and tells fido which files to ignore. Currently, ‘exclude’ only works inside a file block with an exceeds rule but I plan to make better use of it.

Finally we notice one final feature that we’ve never seen before. The ‘throttle’ directive tells fido how long to wait between alerts. In this scenario, fido will trigger an alert the second it finds a file which exceeds 8 days. If the problem is not addressed within twelve hours, it will fire another alert. Alerts will continue in twelve hour intervals until the problem is corrected.

I hope you enjoy these features. If there are enhancements you’d like to see, feel free to contact me either in the comments or by email.

Did you ever want to process a file immediately after it was uploaded via FTP? You could have the upload script execute a remote command after the file is uploaded. That requires shell access that you may or may not be able to grant. On the server, you could run a processing script every minute out of cron but that could get messy.

Fido provides alternative method.

Starting with version 1.0.7, Fido has the ability to monitor a file or directory by its modification date. When the date changes, fido launches a script. We can use this feature to process files that are uploaded via ftp.

In this example, we’ll monitor a directory. In fido.conf, we’ll set up a file block that points to a directory. (For more information about configuring fido, see the user’s manual). This is our configuration:

With this configuration, fido will continuously watch /home/jdfulmer/incoming for a modification change. When a file is upload, the date will change and fido will launch /home/jdfulmer/bin/process. Pretty sweet, huh?

Not quite. The modification date will change the second ftp lays down the first bite. Our script would start to process the file before it’s fully uploaded. How do we get around that? We can make our script smarter.

For the purpose of this exercise, I’m just going to move uploaded files from incoming to my home directory. Here’s a script that will do that:

In order to ensure the file is fully uploaded, I check lsof for its name. If there’s an open file handle under that name, then the script will continue to loop until it’s cleared. When the while loop breaks, the script moves the file.

There’s just one more thing to think about. When the script moves the file what happens to the directory fido is watching? Yes. Its modification date changes. In my example, process runs a second time but does nothing since nothing is there. Depending on your situation, you may need to make the script a little smarter.

I wanted to illustrate how to use fido with an example. Today we’re going to use it to count software downloads on this site. Exciting! This will be simple since we only have one data source. A few years ago, I move my software from an FTP repository onto this web server. To quantify software downloads, we can simply monitor the http access log.

This tells fido to monitor the access_log in real time. Its pattern match rules are in a file called downloads.conf When fido finds a match, it will execute a program called tally. Finally, the last directive tells fido to use syslog to log its activity.

In order to understand what we’re looking for, you should take a look at the software repository. It contains multiple versions and helpful links to the latest releases and betas. We want to match them all.

Let’s take a look at our downloads.conf file. Since we didn’t specify a full path to the file, fido knows to look for it under $sysconfdir/etc/fido/rules. If you configured it to use /etc, then the rules are found in /etc/fido/rules/downloads.conf. Here’s the file:

Each line begins with an optional label. If a label is present, fido will pass it (minus the colon) to the action program. In the example above, if the JoeDog-Config perl module is downloaded, then fido will run /usr/local/bin/tally CONFIG. For more on labels, see the fido user’s manual.