Digital Backups

Backing Up Digital Photographs and Media

Memories stored digitally are extremely fragile, but their digital nature allows many ways to protect them. Loss can occur in many ways: destruction, theft, degradation and obsolescence. The risk from each of these can be greatly reduced by a well-thought-out backup strategy. It is probably impossible to guarantee that digital files last forever and survive any catastrophe, but one can reduce the probability of loss to nearly zero with sufficient effort.

This article covers different options for dealing with each type of potential loss. It is intended for backing up digital photographs, but the principles apply to all digital media. This is the 7th step in the digital photography workflow and the most important.

Strategy

As the old saying goes, any plan is better than no plan. It is far more important to have a backup strategy than to find the perfect one. The basis of backups is to have more than one location where digital files exist. The choice of location and storage system determines how durable and resilient the strategy is. Strategies require varying amounts of effort. It is therefore quite important to choose one which is feasible given your free time and resources.

Duplication

The simple act of duplicating images reduces the probability of losing them. Even duplicating on the same media is a way to reduce the chances of corruption due to bit degradation. A RAID (Redundant Array of Inexpensive Disks) is an automatic way to create duplication within a single physical unit using a set of identical drives. RAID-1 uses two drives and keeps all data identical on both. This only provides protection against disk crashes and corruption. It does, however, provide the very fastest speed of recovery because a duplicate disk is always accessible. This is known as redundancy, which is different from a backup as it does not protect against accidents. If a file gets erased, it is automatically erased from both disks. Therefore a RAID alone is not a backup solution.

Duplication on unsynchronized disks is essential for a backup solution as modifications to original files are not propagated to duplicate copies. Hard-disks, optical disks, cloud-storage and tapes are all appropriate media for duplication. Each time a backup needs to be created, files should be copied to a chosen medium:

Hard drives: Hard drives are the fastest medium to write to when accessed locally. They provide the largest capacity but are the most susceptible to physical damage.

Optical disks: These are slower than hard drives but much more robust. Write-once versions and even rewritable ones are hard to erase accidentally. Capacities are limited compared to other media types, though.

Cloud storage: This is usually a number of different storage types managed by a third-party company and accessed through a network. Capacity and durability vary by service provider.

Tape: Old-fashioned magnetic media with relatively large storage capacity and extreme durability in controlled conditions, but it is the slowest medium to access.

Each additional copy provides extra protection. The first additional copy obviously gives the greatest benefit, but a third copy is quite beneficial as well. Further copies give diminishing returns, provided that the first three are well thought out.
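The diminishing returns can be illustrated with a minimal sketch. It rests on a strong simplifying assumption that is not part of this article: every copy is lost independently and with the same probability over a given period. The function name and the 5% figure are purely illustrative.

```python
def loss_probability(p_single: float, copies: int) -> float:
    """Chance that every copy is lost, assuming independent, equally likely losses."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    p = 0.05  # hypothetical chance of losing any one copy in a given period
    for n in range(1, 5):
        print(f"{n} copies: {loss_probability(p, n):.6%}")
```

Copies kept in the same place are not independent, since one fire or theft can take them all, which is why the later section on distribution matters as much as the number of copies.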

Incremental

Since digital images are captured once and should really never be modified, they are perfectly suited to incremental backups. An incremental backup is one where new files are stored in new backups and old files generally remain untouched. This greatly reduces the time and resources needed to produce backups. Still, occasional full (non-incremental) backups are needed to protect against obsolescence and corruption, which are discussed further below.

With incremental backups, each time a new backup needs to be made, all new files since the creation of the last backup are copied into the backup medium. With both hard-disk and cloud-storage this normally adds files to existing storage. On optical disks and tapes which are slower to write to, typically a new disk or tape is used for each incremental backup.
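For the hard-disk case, an incremental backup could look roughly like the sketch below. It assumes, as stated above, that images are never modified once captured, so a file already present in the backup is simply skipped. The function name and folder layout are hypothetical, and the backup folder is assumed to sit outside the source folder.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, backup: Path) -> list[Path]:
    """Copy files that are not yet in the backup; never touch existing ones."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dest = backup / src.relative_to(source)
        if dest.exists():
            continue  # already backed up; old files are left untouched
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

Running it again after a new shoot copies only the files added since the previous run.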

The choice of increments is personal. The trade-off is that small increments imply more work while large increments increase the window of opportunity for files which are not yet backed up to be lost or damaged. The increment size should be chosen depending on the desired risk level. One can set the increment based on time or storage size. So either backups occur every so often (weeks or months) or they are created every time a certain volume of images has been produced (every 4.5GB for example).

Distribution

Distribution refers to the physical distance between backups. A widely distributed set of backups protects against serious catastrophes such as natural disasters and destruction of property from fire, water and other uncontrollable elements.

The principle is simple. The greater the distance between original data and its backup, the less likely it is that a single event will damage both copies. Ideally, data should exist in two or more separate buildings. Surviving the destruction of an entire city requires much more effort unless using cloud-storage services. Most cloud-storage services have data centers in different regions and replicate between 3 and 27 copies. Some providers are known to use military bunkers as well.

Physical separation of backups does make recovery considerably slower, since the remote location needs to be accessed either physically or via a network, which can take quite some time in either case. This is where a three-copy backup system is very advantageous. One copy can be kept near the original data and the other can be kept far away.

Loss & Destruction

It does not take much to lose or destroy a massive amount of digital data. Given that a 1 TB drive can hold over one million photos, a short fall of such a drive can be devastating. Solid state disks are much more robust mechanically but can be damaged by fire, water and falling objects. DVDs are quite sensitive to scratches due to their exposed surfaces and can be destroyed by heat. As said already, distribution is the best way to minimize the chances of physical damage.

Unintentional loss can happen too. Thieves are most likely to steal valuable items like a computer, a laptop or an external storage device. Optical disks have the least value, but that does not mean they won't be stolen as part of something else. For example, if someone steals luggage, anything in it, no matter its value, is gone.

Corruption

Over time, errors appear on digital media simply because the medium gets older. Bits change randomly, which can cause one or more files to become unreadable. This is one problem which cannot be avoided. The use of parity bits can help restore a few missing bits, which is what RAID-5 does.
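The parity idea can be shown with a small sketch. It is a simplification and not how any particular RAID controller is implemented: a parity block is the byte-wise XOR of the data blocks, and any single block known to be missing can be rebuilt by XOR-ing the survivors with the parity. The function names are illustrative.

```python
def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing block from the survivors and the parity."""
    return xor_parity(surviving + [parity])
```

Note that parity on its own only rebuilds data once it is known which block was lost. It does not reveal which block has silently changed.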

Unfortunately digital media is relatively young and no one truly knows what would happen to it after hundreds of years. CD and DVD manufacturers claim longevity of 200+ years for some archival disks. In practice, though, numerous independent tests on actual media have shown bit corruption to appear in as little as two years, even in controlled storage conditions, which frankly most people cannot reliably maintain.

One can store a checksum along with backups to verify their integrity. Verification should be done every year or two. In the event that a discrepancy is detected, a new copy of the corrupted backup must be made from the original data. Having three or more copies of data also helps with this. When compared to each other, the two which are still identical are most likely corruption-free. This is not guaranteed though, as there is a remote possibility that two copies got corrupted in exactly the same way.
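A checksum list for a backup folder could be produced and checked along these lines. This is a sketch only: the choice of SHA-256 and the function names are assumptions, not something this article prescribes. The resulting dictionary could be saved, for example as JSON, alongside the backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged
```

Build the manifest when the backup is created, then run find_damaged against each copy at every periodic check. Any file it reports should be replaced from a copy that still matches.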

Obsolescence

Sometimes referred to as bit-rot, obsolescence stems from the speed at which technology changes and the high cost of maintaining backwards compatibility in terms of physical and logical layout. On one side, the physical form of storage has to change over time, like the shape of memory cards and the materials used in digital storage. On the other side, formats for storing data change, and data only remains usable as long as some software is capable of interpreting it. Both these changes have occurred in the past. Software compatibility is easier to maintain, but the incentive to maintain it depends on the amount of data that exists.

Fighting obsolescence requires ongoing effort and there is no solution that is foreseeably eternal. The simplest way to stay ahead is to regularly renew all backed-up data onto the latest mainstream media. This also fights corruption if done often. Using mainstream media increases the odds that it will have greater longevity and that effort to maintain compatibility will be made. When file formats change, it is important that renewed backups be made using a recent format.

Prints may not be entirely obsolete. While they do degrade, the fact that prints show images directly, without the need for a reader and software, makes them very accessible when compared to digital media. Restoration work may be needed years later though.

Backup Solution

Implementing a regular backup strategy is an absolute must to preserve digital images. Set either a recurring time or an amount of new data between backups. Using a data limit is good because it adapts to the speed at which images are produced. For doing backups on optical disks, it is easy to set the data limit to the size of the disks, so 4.5GB for DVD or 25GB for Blu-ray.
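The data-limit approach can be sketched as a function that groups new files, in order, into disk-sized batches. The constants take the capacities quoted above and read 1GB as one billion bytes, which is an assumption. In practice some room should be left for file-system overhead. All names are illustrative.

```python
from pathlib import Path

DVD_LIMIT = 4_500_000_000      # 4.5GB, the figure quoted above for DVD
BLURAY_LIMIT = 25_000_000_000  # 25GB, the figure quoted above for Blu-ray


def plan_volumes(files: list[Path], limit: int) -> list[list[Path]]:
    """Group files, in order, into batches that each fit within limit bytes."""
    volumes: list[list[Path]] = []
    current: list[Path] = []
    used = 0
    for path in files:
        size = path.stat().st_size
        if size > limit:
            raise ValueError(f"{path} is larger than one disk")
        if current and used + size > limit:
            volumes.append(current)  # this disk is full, start a new one
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        volumes.append(current)
    return volumes
```

Files are kept in the order given rather than packed as tightly as possible, so each disk holds a contiguous run of images, which makes a particular shoot easier to find later.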

The first step is to organize files so that they are easy to copy and to avoid including unnecessary files that will be a burden. In the introduction to Digital Asset Management, we said that Delete is your friend. It pays well to follow this philosophy in any backup strategy. Avoid having directories larger than the media used for backups.
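A quick check for directories that would not fit on one backup medium might look like the following sketch. The function names are hypothetical, and the limit is whatever capacity in bytes applies to the chosen medium.

```python
from pathlib import Path


def directory_size(directory: Path) -> int:
    """Total size in bytes of all files below directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def oversized_directories(root: Path, limit: int) -> list[Path]:
    """Return the immediate subdirectories of root that exceed limit bytes."""
    return [
        d for d in sorted(root.iterdir())
        if d.is_dir() and directory_size(d) > limit
    ]
```

Any directory it reports is a candidate for splitting, for instance by date, before the next backup.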

If using a 3-way backup system, make a local backup first for efficient access and immediate duplication. This can even be done at a higher frequency than the off-site backup.

Then, at each scheduled backup, copy all new image files to the backup medium. As soon as possible, move the backup medium away from the original data, preferably to a different building or city. Having a locked drawer in an office building is a good option. Using a bank safety deposit box is good too, but consider the size of your media when renting the box.

Every few years, recreate all backups from original copies to avoid potential corruption. If a new medium has become mainstream since the last full backup, copy everything to the new medium, optionally converting losslessly to a new format if the old format can no longer be read natively. At that point it is important to redo all backups.

Field Backups

Field backups are usually temporary backups created before the cataloging and assigning of images to their final location. This is normally carried out while away on assignment or vacations to minimize the potential loss of work in progress.

It is easier when this is carried out using your own equipment rather than relying on local photo stores or Internet cafes. The most common option these days is to use a laptop with plenty of built-in storage. Additional storage can be found in external hard drives, media players and optical disks.

To avoid carrying something as bulky as a laptop, a few stand-alone options exist that can transfer from memory cards directly into a portable hard drive or optical disk.

Obviously the use of optical disks is ideal since this type of field-backup allows for easy and low cost replication as well as distribution. Burned optical disks have virtually no value and therefore are unlikely to be stolen when not in a costly device. They can also be distributed by using the post-office to mail copies to yourself or to friends.

The Cloud

Cloud storage has the potential to create self-supporting backups that guard against corruption, loss and natural destruction. These solutions even guard against media obsolescence but not against outdated file-formats.

The biggest issue is the longevity of the cloud service. It can theoretically last forever if successful enough but, as with all companies built for profit, a service may cease to exist due to bankruptcy or simply be discontinued.

Beware of the terms of service of any cloud-storage solution. What happens to data if payment is not received on time or not processed correctly? What happens after death? What about access rights after the service provider goes through an acquisition, a change of ownership or a change of service? Sure, it can save a lot of effort, but there are still many issues which are unclear.

There is a service called SnapHaven, which I have never used and have no affiliation with, that says it will keep your images for at least 99 years. It is partnered with the Data Permanence Foundation, which would receive the images in trust should SnapHaven cease to exist before then.