We're currently using Puppet (with the manifests and related files in SVN) to manage configuration of our *nix hosts. However, I've hit a bit of a perplexing issue.

Many of our hosts have scripts specific to the host - cron jobs, data mining scripts, scripts that dynamically generate config files, etc. I'm looking for a way to manage these scripts, specifically version them and allow a relatively simple restore solution if a host is rebuilt, while not changing the existing layout.

I can't seem to find a solution that "works" for our environment.

We've got scripts in arbitrary locations on the filesystem. Especially given the age of some of these apps, moving locations isn't an option.

Most of the developers have an edit-in-place workflow. It's unlikely that this can be changed.

SVN doesn't have a way to dereference symlinks, so I can't just create one directory with links to the scripts and version that.

The developers don't have access to Puppet repos. This needs to work within existing filesystem permissions - i.e. a user should only be able to modify the scripts they have access to, nothing else.

I guess the easiest way to explain what I'm looking for is what I wanted but doesn't exist:

SVN that handles symlinks (i.e. symlink all configs into a directory, then work as normal and have them versioned)

An easy way for SVN to manage a given list of paths or directories on a host, but leave everything else untouched.

6 Answers

The real issue here isn't an SCM system; it's that you don't have control over users making uncontrolled changes in your production environment. Unless you can get your users to check in changes, configuration control isn't really the right approach. It sounds like what you really need is a good backup solution that you can use to rapidly restore a working image.

If you want to be able to restore the "configuration" with Puppet instead of a conventional backup solution (which you should have anyway), you might want to look at Jordan Sissel's fpm: http://www.semicomplete.com/blog/geekery/fpm.html . fpm is really intended for generating packages (rpm, deb, etc.), but it can also output a Puppet module given a list of files. You could run a cron job on each host that generates a Puppet module from that host's scripts and checks it into svn/git/etc.

FSVS is a Subversion client for Unix that allows you to use any part of your filesystem as a working copy. It stores SVN data under /var, so it doesn't sprinkle your directories with .svn subfolders. Importantly, it stores Unix file metadata in SVN properties, transparently.

Though there are SCMs that can store links, I'm not aware of any that dereference them in this way. Nor should they, IMHO! What if you wanted to work on two different branches at the same time and checked out two copies, one for each branch? (This is fairly common in my experience with programming projects.) The two checkouts would wreak havoc with each other if each pointed to the same files elsewhere on the disk.

SCMs generally set aside a directory that belongs to them. This allows multiple checked-out repos to live side-by-side without interference, and makes the boundary between SCM-managed and non-SCM-managed files clear.

Instead of links, I'd build a solution around copying and explicit deployment operations. To handle your case, I'd write sync scripts between the deployed locations of your scripts and the SCM-managed source directory (which I'll call the "repo"), so that you can suck in-place edits of your scripts back into the repo. I'd also work on a simple-to-run deployment script (Puppet-based?) that copied from the repo to the target locations for testing.
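As a rough illustration of the copy-based approach, here is a minimal Python sketch. The function names (`mirror_path`, `collect`, `deploy`) and the layout, in which the repo mirrors absolute paths under its root, are assumptions made for the example, not anything prescribed above. `collect` pulls in-place edits back into the repo directory; `deploy` copies the repo contents out to a target root.

```python
import filecmp
import shutil
from pathlib import Path


def mirror_path(repo_root: Path, live_path: Path) -> Path:
    """Map an absolute path such as /etc/cron.d/foo to <repo_root>/etc/cron.d/foo."""
    return repo_root / live_path.relative_to(live_path.anchor)


def collect(live_paths, repo_root: Path):
    """Copy in-place edits from their deployed locations back into the repo."""
    changed = []
    for live in map(Path, live_paths):
        dest = mirror_path(repo_root, live)
        if dest.exists() and filecmp.cmp(live, dest, shallow=False):
            continue  # identical content, nothing to pull in
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(live, dest)  # copy2 also carries over mode bits and mtime
        changed.append(dest)
    return changed


def deploy(repo_root: Path, target_root: Path = Path("/")):
    """Copy every file in the repo out to its location under target_root."""
    deployed = []
    for src in repo_root.rglob("*"):
        if src.is_file() and ".svn" not in src.parts:
            dest = target_root / src.relative_to(repo_root)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            deployed.append(dest)
    return deployed
```

Running `collect` from cron and committing afterwards captures edit-in-place changes; `deploy` pointed at a scratch directory gives a cheap way to test a restore before touching `/`.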

While I agree it would likely be better to re-train your developers to make their edits in Subversion, you could manage a reasonable hack of this without too much effort.

I'd start by making a directory or repository (depending on what makes the most sense in your environment) that has a subdirectory per host you wish to manage this way. Underneath those directories, place the files you wish to manage in a structured way. Personally, I'd do it like so:

per_host_confs/
    host1/
        etc/
            cron.d/
                foo
            logrotate.d/
                bar
        var/
            ...

Then add a cron job (or a Puppet rule) that runs a script. The script checks out the current trunk of the appropriate host's directory to some temporary directory (the $hostname variable will be helpful here in your Puppet recipe), runs a find over that working copy, copies the relevant files from their live locations into place in the working copy, and then commits.
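A sketch of what such a script might look like in Python follows. The function names, the commit message, and the assumption that the repository URL ends in one subdirectory per hostname are all illustrative. Only files already present in the working copy are refreshed, so a new script still has to be added to the repository by hand once.

```python
import filecmp
import os
import shutil
import socket
import subprocess
from pathlib import Path


def refresh_working_copy(working_copy: Path, live_root: Path = Path("/")):
    """For each file tracked in the working copy, pull in the live version."""
    updated = []
    for dirpath, dirnames, filenames in os.walk(working_copy):
        dirnames[:] = [d for d in dirnames if d != ".svn"]  # skip SVN metadata
        for name in filenames:
            tracked = Path(dirpath) / name
            live = live_root / tracked.relative_to(working_copy)
            if not live.is_file():
                continue  # not on this host any more; leave the repo copy alone
            if not filecmp.cmp(live, tracked, shallow=False):
                shutil.copy2(live, tracked)
                updated.append(tracked)
    return updated


def snapshot_host(repo_url: str, working_copy: Path, live_root: Path = Path("/")):
    """Check out this host's directory, refresh it from disk, and commit."""
    host = socket.gethostname()  # assumes one repo subdirectory per hostname
    subprocess.run(
        ["svn", "checkout", f"{repo_url}/{host}", str(working_copy)], check=True
    )
    if refresh_working_copy(working_copy, live_root):
        subprocess.run(
            ["svn", "commit", "-m", f"Automatic snapshot of {host}", str(working_copy)],
            check=True,
        )
```

Because the script runs with the privileges of whoever owns the cron job, it needs read access to every managed file but gives developers no extra write access.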

The Puppet rule for redeploying could do something similar, using a creates parameter or checking modification times to decide when to push files from the working copy/repo onto local disk.

Can you do the symlinks the other way around? Have a directory managed by svn (say /usr/local/scripts/), move all your scripts into that directory (or subdirectories), and put a symlink in each old location. Then whenever someone edits a file through its old path, they are actually editing the file in the svn directory.

If symlinks won't work that way around, you could use hardlinks instead.
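The migration step can be sketched in Python as below. The helper name `relocate` and the choice to mirror the original absolute path under the managed directory are assumptions for illustration. It moves a script into the managed tree and leaves a symlink behind, and is safe to re-run on a path that has already been converted.

```python
import os
import shutil
from pathlib import Path


def relocate(script: Path, managed_root: Path) -> Path:
    """Move a script under managed_root and leave a symlink at its old path."""
    script = script.absolute()
    if script.is_symlink():
        return script.resolve()  # already migrated on an earlier run
    # Mirror the original absolute path under the managed directory.
    new_home = managed_root.absolute() / script.relative_to(script.anchor)
    new_home.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(script), str(new_home))
    os.symlink(new_home, script)  # the old location now points at the managed copy
    return new_home
```

One caveat with the hardlink fallback: an editor that saves by writing a new file and renaming it over the original breaks the link, so the managed copy silently stops receiving edits.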

Once you've done that, you can encourage your staff to edit the files in the new place and gradually standardise on the new directory structure. As a backstop, you could have a cron job that does a check-in each night and emails you the result, so you know who is not using the new system and can give the relevant person a friendly reminder.