A paranoid’s guide to backing up a working folder

Oops time

Leanpub supports multiple storage engines, and a private GitHub repository is probably the safest way to back up your working folder. I chose Dropbox instead, as I didn’t anticipate anything going wrong with the automatic synchronization mechanism.

While working on my book, I accidentally managed to wipe out half of my diagrams, and all the changes were instantly synchronized by Dropbox. The free Dropbox account doesn’t offer folder-level versioning, so the deleted files were simply gone. Fortunately, IntelliJ IDEA’s Local History saved the day, and the diagrams were properly restored.

Backing up

Incidents are inevitable, so a disaster recovery plan should be a top priority from the very beginning.

One of the simplest options is to archive a copy of the working folder and store it in a different location.
As simple as it may be, this approach has some major drawbacks:

A lot of disk space is wasted, even if only a handful of files have changed

Detecting changes requires some external tool
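The archiving approach can be sketched with a timestamped tarball; the folder names below are hypothetical stand-ins for the real working folder and backup location:

```shell
# A sketch of the naive approach: archive the whole working folder
# into a timestamped tarball (folder names are hypothetical)
WORKDIR=$(mktemp -d)            # stand-in for the working folder
BACKUPS=$(mktemp -d)            # stand-in for the backup location
echo "chapter 1" > "$WORKDIR/chapter1.txt"

# every run stores a full copy, even if almost nothing changed
tar -czf "$BACKUPS/backup-$(date +%F).tar.gz" -C "$WORKDIR" .
```

Each run produces a complete archive, which is exactly why this approach wastes disk space when only a few files have changed.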

Disk space is not really a problem when using an external hard drive, but for remote storage, a delta copying mechanism is more suitable.

Although I’m using a Windows machine, I happen to use Cygwin extensively. Even though it comes with tons of Unix utilities, some kernel-related tools can’t be easily implemented on Windows. Without inotify, the watchman utility is out of the picture.

A better alternative is to follow the approach of revision control tools. With this in mind, I turned my working folder into a local Git repository. Even if the repository is not mirrored on a remote machine, I can still take advantage of the version control mechanism. Git provides ways to detect pending changes, and the repository can be copied to multiple locations (addressing the single point of failure issue).
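Turning the working folder into a local Git repository takes only a few commands; the sketch below uses a temporary folder as a stand-in for the real working folder:

```shell
# Turn the working folder into a local Git repository (a sketch;
# the folder is a temporary stand-in for the real working folder)
WORKDIR=$(mktemp -d)
cd "$WORKDIR"
git init -q
git config user.email "author@example.com"   # hypothetical identity
git config user.name "Author"
echo "chapter 1" > chapter1.txt
git add .
git commit -q -m "Initial backup"

# 'git status --porcelain' prints nothing when there are no pending
# changes, which makes it easy to script around
git status --porcelain
```

The machine-readable `--porcelain` output is what makes pending-change detection trivial to automate later on.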

Using rsync, the Dropbox Git repository is mirrored to OneDrive as well.
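The mirroring step boils down to a single rsync invocation; in this sketch, the source and destination are temporary stand-ins for the Dropbox and OneDrive folders:

```shell
# A sketch of mirroring one folder to another with rsync; the paths
# are hypothetical stand-ins for the Dropbox and OneDrive folders
SRC=$(mktemp -d)
DST=$(mktemp -d)
echo "diagram" > "$SRC/diagram.txt"

# -a preserves permissions and timestamps; --delete removes files
# from the mirror that no longer exist in the source
rsync -a --delete "$SRC/" "$DST/"
```

The trailing slashes matter: `"$SRC/"` tells rsync to copy the folder’s contents rather than the folder itself.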

In the end, the working folder is backed by Dropbox and OneDrive and the version control is handled through Git. A full archive copy is also stored on an external drive (just in case).

Process automation

The only thing left to do is to automate the backup process. While cron is the de facto task scheduler on Linux systems, under Cygwin it requires Administrative Privileges, a dedicated Windows Service, and Security Policy adjustments. For simplicity’s sake, I settled for a much simpler approach, using an infinite loop like the following:
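A minimal sketch of such a loop, written to a script file (the folder paths and the one-minute interval are hypothetical, to be adjusted to your own setup):

```shell
# Write the backup loop to a script file (a sketch; the paths and
# the interval are hypothetical)
cat > /tmp/backup.sh <<'EOF'
#!/bin/bash
SRC="$HOME/Dropbox/book"
DST="$HOME/OneDrive/book"

while true; do
    cd "$SRC" || exit 1
    # commit only when there are pending changes
    if [ -n "$(git status --porcelain)" ]; then
        git add -A
        git commit -q -m "Automatic backup $(date +%F_%T)"
    fi
    # mirror the repository to the secondary location
    rsync -a --delete "$SRC/" "$DST/"
    sleep 60           # back up once per minute
done
EOF
chmod +x /tmp/backup.sh
```

Launched from a Cygwin terminal, the script keeps committing and mirroring in the background without any Windows Service or scheduler configuration.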

Conclusion

A backup strategy can save you from irremediable data loss. By mirroring the working folder across several services, you can access your data even when a given external service is down. Keeping track of all changes makes recovery much easier, so a Git repository sounds very appealing.
