Monday, October 18, 2010

Creating parity files to protect against data loss

You'll have heard about RAID 5, where data is striped across at least 3 disks along with parity information, so in the event of the loss of a disk you can get your files back. And of course there's its more resilient cousin RAID 6, which needs at least 4 disks, keeps two sets of parity, and can sustain the loss of 2 disks.
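The parity idea itself fits in a few lines of shell. This is only a toy sketch of single parity using XOR on two made-up byte values. Real RAID works on whole stripes, and PAR2 uses a more elaborate code (Reed-Solomon), so treat it as an illustration rather than how either actually stores data.

```shell
# Toy single-parity demo: two data bytes and one parity byte.
chunk1=173                        # arbitrary example values, not from a real file
chunk2=92
parity=$(( chunk1 ^ chunk2 ))     # the extra value stored alongside the data

# Pretend chunk1 is lost: XOR the survivor with the parity to rebuild it.
rebuilt=$(( parity ^ chunk2 ))
echo "parity=$parity rebuilt=$rebuilt"
```

The same trick works whichever of the two values goes missing, which is why one parity block can cover the loss of any single chunk.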

I was wondering how to do this on demand so I could take advantage of 'cloud' storage. Microsoft offers 25 GB free. I also want to tie this in with backing up only my RAW photo files, which I can already identify, gather together and encrypt.

One way is to use the Parchive (parity archive volume set) idea developed to transmit files over Usenet. You need to start with a file that is already split into smaller chunks, and RAR is the best tool for that. This howto is focussed on OS X; I guess other OSs would be similar.

The reason for compressing first, as a separate step, is that I want to split the file into 2 near-equal sized pieces, and doing a compress and split in the same step means it's hard to work out the size of the chunks.

Step 2 is to split this file into 2 pieces. I know that RAR'd it is 53.5 MB, so half will be about 26.75 MB. After some trial and error I find -v26250k to be a good value. Of course what is actually happening is that this sets the size of the first chunk, and the second will be the remainder. If I get 3 files I've used a value that's too small.

$ /applications/rar/rar a -v26250k photos.rar Photos.dmg.rar

and this creates 2 files.
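The trial and error can be shortened with a little arithmetic. This is a minimal sketch, assuming rar's k suffix counts units of 1024 bytes and that you know the archive size in bytes. The function name half_volume_k and its margin argument are made up for illustration, and the 53,500,000 byte figure is my reading of "53.5 MB".

```shell
# half_volume_k BYTES [MARGIN_K]
# Prints a candidate number for rar's -v<size>k switch: half of BYTES,
# rounded up to whole units of 1024 bytes, plus an optional safety margin.
half_volume_k() {
    bytes=$1
    margin_k=${2:-0}
    echo $(( (bytes + 2047) / 2048 + margin_k ))
}

# Example with the 53.5 MB archive (assumed to be 53,500,000 bytes).
half_volume_k 53500000        # bare half
half_volume_k 53500000 126    # with some headroom for archive overhead
```

Some headroom is worth having because the split archive carries its own headers, so the total is a little bigger than the input. Too small a value gives 3 files, while too large a value just makes the two chunks less equal.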

Next get MacPAR deLuxe. Open the app and it will open a new PAR 2 window. Drag in your two files, or do Edit > Add Files or shift-apple-F.

Then select File > Create PAR Set or shift-apple-A.

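The next dialog asks how much parity data to generate. You need to put in over 50; for safety I choose 55. Losing one chunk means rebuilding it entirely from parity, so the parity data has to be at least as big as the larger of the two chunks. The less equal your chunks, the greater the percentage you need.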
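To put a number on that, here is a rough sketch that works out the lowest percentage worth trying from the two chunk sizes. The name min_parity_pct and the example sizes are hypothetical. PAR2 works in whole blocks, so the figure is a floor to round up from, not a guarantee.

```shell
# min_parity_pct SIZE1 SIZE2
# Prints the smallest whole percentage of the combined size that is at least
# as large as the bigger chunk (integer ceiling division).
min_parity_pct() {
    a=$1
    b=$2
    total=$(( a + b ))
    if [ "$a" -ge "$b" ]; then larger=$a; else larger=$b; fi
    echo $(( (larger * 100 + total - 1) / total ))
}

min_parity_pct 26750000 26750000   # two exactly equal halves
min_parity_pct 26880000 26620000   # a slightly uneven split (made-up sizes)
```

Adding a few points on top of the result, as with the 55 above, leaves room for block rounding.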

We get a whole bunch of files which represent the parity data.

You can test that you can recover your data if either chunk 1 or 2 is lost by removing one of those chunks from the directory and double-clicking on the .par2 file. It will load into MacPAR deLuxe and you should get the 'missing but can be recovered' message.
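The same check could be scripted rather than done by hand. This sketch assumes the par2 command-line tool (par2cmdline) is installed, which is not what this howto uses. The function name can_recover is hypothetical, and the chunk is assumed to sit in the same folder as the .par2 file.

```shell
# can_recover PAR2_FILE CHUNK
# Hides CHUNK, asks par2 to rebuild it from the parity data, and succeeds
# only if the rebuilt file is identical to the original.
can_recover() {
    index=$1
    chunk=$2
    stash=$(mktemp -d) || return 1
    mv "$chunk" "$stash/original" || return 1    # simulate losing the chunk
    par2 repair "$index" >/dev/null 2>&1         # rebuild from parity
    if [ -f "$chunk" ] && cmp -s "$chunk" "$stash/original"; then
        rm -r "$stash"                           # rebuilt copy matches
        return 0
    fi
    mv "$stash/original" "$chunk"                # failed: put the original back
    rm -r "$stash"
    return 1
}
```

Calling it once per chunk covers both cases. A failure would suggest the parity percentage was set too low.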

Now, RAR the parity files together. It's easiest to move them to their own folder, then run