Which Disaster Recovery Site Strategy Is Right for You?

Mike Chapple is associate teaching professor of IT, analytics and operations at the University of Notre Dame.

Will you be prepared if disaster strikes your organization? That’s the million (or multimillion) dollar question for IT professionals engaged in disaster recovery (DR) planning. Providing an affirmative answer requires the investment of time and money to build a solid disaster recovery plan.

One of the biggest issues facing disaster recovery planners is the selection of an appropriate DR site type. As with many IT decisions, planners face a cost vs. performance tradeoff: The more an organization spends, the more quickly its site can resume operations.

When making these decisions, planners should consider their organization’s objectives and how critical an immediate or near-immediate resumption of operations is to its continued viability. Financial institutions, healthcare institutions at universities and other organizations engaged in time-sensitive computing activities may be willing to make sizeable investments to avoid even a few hours of downtime, investments that would seem unreasonable in other environments.

Cold Sites: Trading Availability for Cost

The cold site is the bare-bones approach to disaster recovery. These facilities have the basic infrastructure needed to run a data center, such as heating, ventilation and air conditioning (HVAC), power and network connectivity, but not much else.

Cold sites are designed to provide coverage for long-term outages of the primary site, such as those caused by a building fire, hurricane or other major disaster that renders the primary site completely inoperable. If a disaster does occur, an organization must then acquire the hardware necessary to resume operations, build systems, install applications and load data from backup tapes. It should be no surprise that the recovery time for cold sites is measured in days or weeks rather than in hours.

Alternative Locations for Cold Sites

If a cold site is the right strategy for your organization, there are some creative options to consider. It’s not necessary to purchase or lease a facility that sits unused until it’s needed. There are two major alternatives that can significantly reduce costs.

One option is to reserve space in a third-party cold site that will be available for use in the event of a disaster. If you do choose this approach, make sure the contract contains enforceable provisions that guarantee access to the facility when it’s needed. And don’t wait until a disaster strikes before learning that the site operator triple-booked the space. For similar reasons, consider contracting with a facility located a great distance away from the primary site. In the event of a regional disaster near a main data center, chances are there will be significant demand for recovery sites in that vicinity, if they are even still operating.

A second option is to create a cold site in a facility already owned by your organization and used for another noncritical purpose. Put the basic required infrastructure in place, and then reuse that space for offices, storage or other purposes until there’s a need to activate the cold site.

Warm Sites: The Middle Ground

For many organizations, the long activation time required to stand up a cold site presents an unacceptable risk. Warm sites address this by moving beyond the basic infrastructure provisioned at a cold site to include the hardware necessary to restore operations. Depending on the nature of the warm site, administrators may also choose to have the hardware loaded with the operating systems and/or applications required to resume operations.

Warm sites also include a copy of the organization’s data in some form. This could be as simple as storing backup tapes at the site and planning to restore all data from tape in the event of warm site activation, or it might be more advanced and include storage systems with replicated copies of the data.

The time required to activate a warm site depends on many of the decisions made when configuring the warm site:

Is the organization’s data stored on a storage system that can be directly accessed by servers at the site, or does it need to be restored from tape?

Are operating systems already loaded on the hardware at the site?

Are applications installed on those systems as well?

In cases where the answers to all of these questions are yes, an organization can typically activate the warm site in a matter of hours. In other cases, it may take several days to get the site up and running.
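The three questions above can be sketched as a simple readiness check. This is only an illustration: the class name, field names and function names are hypothetical, and the "hours" and "days" results simply restate the rough estimates given above rather than measured recovery times.

```python
from dataclasses import dataclass


@dataclass
class WarmSiteConfig:
    """Hypothetical description of how a warm site has been prepared."""
    data_directly_accessible: bool  # False means data must be restored from tape
    os_preloaded: bool              # operating systems already on the hardware
    apps_installed: bool            # applications already installed


def activation_estimate(cfg: WarmSiteConfig) -> str:
    """Return a rough activation time: 'hours' only if every answer is yes."""
    if cfg.data_directly_accessible and cfg.os_preloaded and cfg.apps_installed:
        return "hours"
    return "days"


def outstanding_work(cfg: WarmSiteConfig) -> list[str]:
    """List the tasks that would still need doing when the site is activated."""
    tasks = []
    if not cfg.data_directly_accessible:
        tasks.append("restore data from tape")
    if not cfg.os_preloaded:
        tasks.append("load operating systems")
    if not cfg.apps_installed:
        tasks.append("install applications")
    return tasks


if __name__ == "__main__":
    site = WarmSiteConfig(data_directly_accessible=False,
                          os_preloaded=True,
                          apps_installed=True)
    print(activation_estimate(site))
    print(outstanding_work(site))
```

A planner could extend a check like this with real time estimates for each outstanding task, based on the organization’s own restore tests.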

Hot Sites: The Cadillac Experience (and Price Tag)

Hot sites provide the ultimate disaster recovery experience, with instantaneous or near real-time recovery of operations when the primary site fails. They build upon the warm site concept by ensuring that systems at the site are preloaded with the operating systems, applications and data necessary to resume operations. In return for the significant investment of time and money required to stand up a hot site, the organization gains the ability to resume operations within seconds or minutes after a disaster disrupts the primary site.

Some organizations that have a handful of critical systems and processes choose to implement a hybrid approach that uses hot site capabilities for a small number of essential services and a warm site approach for other systems that have a longer maximum tolerable downtime. This allows disaster recovery planners to focus scarce resources on the most critical processes without completely neglecting other services.
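One way to picture this kind of tiering is to match each system’s maximum tolerable downtime against the typical recovery time of each site type and choose the least expensive option that fits. The sketch below is a minimal illustration, not a planning tool: the recovery times are placeholder assumptions loosely based on the ranges described in this article, the system names are hypothetical, and the cold site is included only for completeness.

```python
from datetime import timedelta

# Assumed typical recovery times, listed from least to most expensive site type.
# These figures are illustrative placeholders, not benchmarks.
SITE_RECOVERY_TIMES = [
    ("cold", timedelta(days=7)),
    ("warm", timedelta(hours=8)),
    ("hot", timedelta(minutes=5)),
]


def choose_site(max_tolerable_downtime: timedelta) -> str:
    """Pick the cheapest site type that recovers within the tolerable downtime."""
    for site_type, recovery_time in SITE_RECOVERY_TIMES:
        if recovery_time <= max_tolerable_downtime:
            return site_type
    # Nothing recovers fast enough; a hot site is the closest available option.
    return "hot"


if __name__ == "__main__":
    # Hypothetical systems and their maximum tolerable downtimes.
    systems = {
        "payment processing": timedelta(minutes=10),
        "email": timedelta(days=1),
        "records archive": timedelta(days=30),
    }
    for name, mtd in systems.items():
        print(f"{name}: {choose_site(mtd)} site")
```

Ordering the list from least to most expensive is what concentrates hot site spending on the few systems that truly need it.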

If you are considering a hot site, investigate the impact on your software license agreements. Depending on the language in your vendor contracts, you might be required to purchase additional licenses (at full price or a reduced rate) to support the hot site, because the applications will be installed and running there. Some license agreements allow for the installation of software at a hot site at no additional cost, provided that only one site is in operation at a time.

Selecting an appropriate disaster recovery site is one of the most important decisions made during the disaster recovery planning process. It’s important to carefully consider your specific needs along with the costs and capabilities of each site option when identifying the best approach for your organization.