The “D” in HADR

HADR is an acronym that stands for High Availability / Disaster Recovery. Although you can think of these two concepts individually, I prefer to think of them on a continuum. If your system is spread out across multiple locations, for instance, it can be both highly available and able to quickly recover from a disaster.

The “D” stands for disaster, and it’s important that you’re ready for one. Most of us have a good backup plan in place, and we’ve practiced a restore operation to make sure it all works. But one thing I don’t see often is a set of documentation that shows what to do in a disaster: a DR Plan. I’ve written about these before, but the basic idea is a set of electronic or, even better, paper instructions that explain how to recover the systems. This includes things like the location of the backups, the systems needed to restore them, the impact analysis, and so on.
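
As a rough illustration only, here is a minimal Python sketch of a plan that can flag its own gaps and print itself as plain text for the paper copy. The class name `DRPlan`, the section names, and the sample values are all hypothetical; they simply mirror the items listed above and are not a standard template.

```python
from dataclasses import dataclass, field

# Hypothetical section names, mirroring the items mentioned in the post.
REQUIRED_SECTIONS = ("backup_locations", "restore_systems", "impact_analysis")


@dataclass
class DRPlan:
    """A DR plan as a title plus free-text sections keyed by name."""
    title: str
    sections: dict = field(default_factory=dict)

    def missing_sections(self):
        """Return the required sections that are absent or blank."""
        return [
            name for name in REQUIRED_SECTIONS
            if not self.sections.get(name, "").strip()
        ]

    def render(self):
        """Render the plan as plain text suitable for printing."""
        lines = [self.title, "=" * len(self.title), ""]
        for name in REQUIRED_SECTIONS:
            heading = name.replace("_", " ").title()
            body = self.sections.get(name, "").strip() or "TODO: not yet documented"
            lines += [heading, "-" * len(heading), body, ""]
        return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder content for illustration; replace with your own details.
    plan = DRPlan(
        title="Example DR Plan",
        sections={"backup_locations": "<where the backups are stored>"},
    )
    print(plan.render())
    print("Missing:", plan.missing_sections())
```

Keeping the output as plain text is deliberate: it can be printed and filed without depending on any system that might itself be down.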

Do you have one of these? Does everyone on the team know where it is? How about the non-technical folks, in case you’re not available and they need to call in some technical help? Would they know where to find it?

In a disaster, people aren’t always thinking clearly. So make sure your documentation is discoverable, and that it’s easy to follow – at least for a technical professional.

Comments

Printed on paper; distributed to a broad but appropriate group in multiple copies (one on your desk, one at home); in a plastic bin just inside the doors to the main computer room and the DR computer room; and verbal reminders that the documentation wiki depends on SQL Server so...take the paper!

Of course, that's far from enough: you have to do failover tests, and you have to follow the documentation even if you know what you are doing. We usually try to have someone who does NOT know what they are doing follow it, e.g. a senior app developer who doesn't usually manage systems. What if she's the only one who can get to the site?

The nature and size of the business, and how critical the data is to the core business function, make a difference in how this is approached.