For the purposes of this article, we define disaster recovery as the ability to recover from a situation in which a data center that hosts SharePoint Server becomes unavailable.

The disaster recovery strategy that you use for SharePoint Server must be coordinated with the disaster recovery strategy for the related infrastructure, including Active Directory domains, Exchange Server, and Microsoft SQL Server. Work with the administrators of the infrastructure that you rely on to design a coordinated disaster recovery strategy and plan.

A second farm in a different location is often described as a hot, warm, or cold standby, depending on the time and immediate effort needed to get it up and running. Our definitions for these terms are as follows:

Hot standby: A second data center that can provide availability within seconds or minutes.

Warm standby: A second data center that can provide availability within minutes or hours.

Cold standby: A second data center that can provide availability within hours or days.
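As a rough illustration of how these definitions might be applied when you sort workloads by their recovery time objective, the following Python sketch maps an objective to a standby tier. The cut-off values (`HOT_LIMIT`, `WARM_LIMIT`) and the function name are assumptions made for this example; the definitions above give overlapping ranges rather than exact boundaries, so choose limits that match your own service level agreements.

```python
from datetime import timedelta

# Assumed cut-offs for illustration only; the article defines the tiers
# as overlapping ranges (seconds/minutes, minutes/hours, hours/days).
HOT_LIMIT = timedelta(minutes=15)
WARM_LIMIT = timedelta(hours=4)


def standby_tier(recovery_time_objective: timedelta) -> str:
    """Return the least demanding standby tier that can meet the objective."""
    if recovery_time_objective <= HOT_LIMIT:
        return "hot"
    if recovery_time_objective <= WARM_LIMIT:
        return "warm"
    return "cold"
```

For example, under these assumed limits a workload that must be back within two hours would fall into the warm standby tier.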

Disaster recovery can be one of the more expensive requirements for a system. The shorter the interval between failure and availability and the more systems you protect, the more complex and costly a disaster recovery solution is likely to be. When you invest in hot or warm standby data centers, costs include:

Additional hardware and software, which often make operations between software applications more complex; for example, you might need custom scripts for failover and recovery.

Additional operational complexity.

The costs of maintaining hot or warm standby data centers should be evaluated based on your business needs. Not all solutions within an organization are likely to require the same level of availability after a disaster. You can offer different levels of disaster recovery for different content, services, or farms: for example, for content that has a high impact on your business, for search services, or for an Internet publishing farm.

Disaster recovery is a key area in which information technology (IT) groups offer service level agreements (SLAs) to set expectations with customer groups. Many IT organizations offer a variety of SLAs that are associated with different chargeback levels.

When you implement failover between server farms, we recommend that you first deploy and tune the core solution within a farm, and then implement and test disaster recovery.

You can choose among many approaches to provide disaster recovery for a SharePoint Server environment, depending on your business needs. The following examples show why companies might choose cold, warm, or hot standby disaster recovery strategies.

Cold standby disaster recovery strategy: A business ships backups to support bare metal recovery to local and regional offsite storage on a regular basis, and has contracts in place for emergency server rentals in another region.

Pros:

Often the cheapest option to maintain, operationally.

Cons:

Often an expensive option to recover, because it requires that physical servers be configured correctly after a disaster has occurred.

In a cold standby disaster recovery scenario, you can recover by setting up a new farm in a new location (preferably by using a scripted deployment) and restoring backups. Alternatively, you can recover by restoring a farm from a backup solution such as Microsoft System Center Data Protection Manager 2007, which protects your data at the computer level and lets you restore each server individually. This article does not contain detailed instructions for how to create and recover in cold standby scenarios.

In a warm standby disaster recovery scenario, you consistently and frequently create virtual images of the servers in your farm and ship them to a secondary location. At the secondary location, you must have an environment available in which you can easily configure and connect the images to re-create your farm environment.

In a hot standby disaster recovery scenario, you can set up a failover farm to provide disaster recovery in a separate data center from the primary farm. An environment that has a separate failover farm has the following characteristics:

A separate configuration database and Central Administration content database must be maintained on the failover farm.

To provide availability across data centers for service applications, we recommend that for the services that can be run cross-farm, you run a separate services farm that can be accessed from both the primary and the secondary data centers.

For services that cannot be run cross-farm, and to provide availability for the services farm itself, the strategy for providing redundancy across data centers for a service application varies. The strategy employed depends on whether:

There is business value in running the service application in the disaster recovery farm when it is not in use.

The databases associated with the service application can be log-shipped or asynchronously mirrored.

The service application can run against read-only databases.
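The criteria above can be read as a simple decision sequence. The following Python sketch is one possible way to express it, not an official algorithm: the class, field names, and returned strings are hypothetical, and the branch that runs a service on the failover farm before failover is an inference from the criteria rather than a stated recommendation. Where the per-service guidance in this article differs (for example, for the Logging database), that guidance takes precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceApp:
    """Hypothetical description of a service application's DR properties."""
    name: str
    runs_cross_farm: bool      # can be consumed from a separate services farm
    replicable: bool           # databases can be log-shipped or async mirrored
    runs_read_only: bool       # works against read-only databases
    valuable_while_idle: bool  # business value in running it before failover


def dr_strategy(app: ServiceApp) -> str:
    """Pick a redundancy approach from the criteria listed in the article."""
    if app.runs_cross_farm:
        return "host on a shared services farm"
    if not app.replicable:
        return "deploy on both farms and keep configuration in sync"
    if app.runs_read_only and app.valuable_while_idle:
        return "replicate databases and run on the failover farm"
    return "replicate databases and start the service after failover"
```

A service whose databases can be replicated but that cannot run against read-only databases ends up in the last branch, which corresponds to replicating the databases and starting the service only after failover.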

The following sections describe the disaster recovery strategies that we recommend for each service application. The service applications are grouped by strategy.

Project Server 2010 requires synchronization between its databases. Project Server can be replicated between farms by using an asynchronous replication mechanism (asynchronous database mirroring, log shipping, or asynchronous SAN replication), but, for recovery, you must ensure that the Project database logs are synchronized as you restore.

Note

Although we recommend that you log-ship or mirror the Project Server databases to the disaster recovery farm, the Project Server service application cannot run against read-only databases. Therefore, we recommend that you do not run the Project Server service application on the disaster recovery farm until after failover. To successfully synchronize the Project Server databases on the disaster recovery farm, you must configure either time stamps or log marking for the databases.
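One way to picture the log-marking approach is a small helper that generates the same `RESTORE LOG ... WITH STOPATMARK` statement for every Project Server database, so that all of them stop at one shared marked transaction. This Python sketch only builds the T-SQL text and does not connect to SQL Server. The database names, backup folder, file naming pattern, and mark name are hypothetical, and the sketch assumes that a marked transaction with that name was already committed across all of the databases on the primary farm.

```python
from typing import List


def restore_to_mark(databases: List[str], backup_dir: str, mark: str) -> List[str]:
    """Build one RESTORE LOG statement per database, all stopping at the same mark.

    Inputs are assumed to be trusted administrator-supplied values; no SQL
    escaping is performed in this sketch.
    """
    statements = []
    for db in databases:
        # Assumed naming pattern: one log backup file per database, <db>.trn
        statements.append(
            f"RESTORE LOG [{db}] FROM DISK = N'{backup_dir}\\{db}.trn' "
            f"WITH STOPATMARK = '{mark}', RECOVERY;"
        )
    return statements
```

Review the generated statements before you run them, and test the sequence on a non-production copy of the databases first.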

Secure Store service application

Databases: Secure Store

Usage and Health Data Collection service application

Databases: Logging

Note

It is possible to log-ship or mirror the Logging database. However, we recommend that you do not run the Usage and Health Data Collection service on the disaster recovery farm, and that you do not mirror or log-ship the Logging database.

Web Analytics service application

Databases: Staging, Reporting

Note

We recommend that you log-ship or mirror the Web Analytics Staging and Reporting databases. However, we recommend that you do not run the Web Analytics service application on the disaster recovery farm until after failover.

The following service applications must be deployed on both the primary and failover farms, and cannot be log-shipped or asynchronously mirrored. For most of these service applications, we recommend that you deploy them and then verify that the failover farm has the same configuration settings as the primary farm. If configuration changes that affect the service are made on the primary farm, you must update the failover farm.

Application Registry service application

Databases: Application Registry service

Log-shipping the Application Registry service database is not supported.

Business Data Connectivity service application

Databases: Business Data Connectivity

User Profile service application

Databases: Profile, Synchronization, Social Tagging

The Profile, Synchronization, and Social Tagging databases cannot be log-shipped.

To provide redundancy for the User Profile service application, you must first deploy the service application in both the primary and secondary data centers.

To set up the Profile and Synchronization databases, we recommend that you recover a backup of the databases to the secondary data center and attach them to the User Profile service application in that data center.

Search requires complete synchronization between its databases and index. Because of this requirement, search cannot be replicated between farms by using an asynchronous replication mechanism (asynchronous database mirroring, log shipping, or asynchronous SAN replication).

To provide up-to-date search on a failover farm, you must run search on the secondary farm.

Important

The Search service application on the failover farm must be set to actively crawl the secondary farm. On failover, you must configure the Web application association to use the failover Search service application.

In an ideal scenario, the failover components and systems match the primary components and systems in all ways: platform, hardware, and number of servers. At a minimum, the failover environment must be able to handle the traffic that you expect during a failover. Keep in mind that only a subset of users may be served by the failover site. The systems must match in at least the following:

Operating system version and all updates

SQL Server versions and all updates

SharePoint 2010 Products versions and all updates
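A parity check like the one described in this list can be automated once you have collected version information from both farms. The following Python sketch assumes that you already have that inventory as simple dictionaries; the component keys and version strings in the usage example are made up for illustration, and how you gather them is outside the scope of this sketch.

```python
from typing import Dict, Optional, Tuple


def version_mismatches(
    primary: Dict[str, str], failover: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return components whose versions differ, as (primary, failover) pairs.

    A component that is missing from one farm is reported with None on that side.
    """
    components = sorted(set(primary) | set(failover))
    return {
        name: (primary.get(name), failover.get(name))
        for name in components
        if primary.get(name) != failover.get(name)
    }


# Usage with made-up inventory values:
primary_farm = {"os": "A.1", "sql_server": "B.2", "sharepoint": "C.3"}
failover_farm = {"os": "A.1", "sql_server": "B.1", "sharepoint": "C.3"}
print(version_mismatches(primary_farm, failover_farm))
```

An empty result means that every listed component reports the same version string on both farms.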

Although this article primarily discusses the availability of SharePoint 2010 Products, the system uptime will also be affected by the other components in the system. In particular, make sure that you do the following:

Ensure that infrastructure dependencies such as power, cooling, network, directory, and SMTP are fully redundant.