
Disaster Recovery

What is disaster recovery?

Disaster recovery, as defined by NIST (National Institute of Standards and Technology) Special Publication 800-34, is a strategy for supporting continuity of operations during a wide range of disruptive events. The goal of business continuity and disaster recovery (BC/DR) initiatives is to ensure that your organization can continue to operate through a disaster, minimizing or preventing permanent damage to the business.

Disaster recovery (DR) is the ability of an organization to resume operations after a significant disruptive event such as a fire, flood, earthquake, storm, hazardous material release or even an IT outage. DR initiatives usually aim not only to avoid further loss caused by the disruptive event, but also to prevent damage to customers and other businesses. One of the main goals of DR is to ensure that key business functions can continue in a timely manner and with minimal disruption to the organization.

What is a business continuity plan (BCP)?

A business continuity plan (BCP) is a set of procedures that allow an organization to continue critical functions during and after adverse events. A BCP describes step-by-step actions that should be taken in the event of an emergency, disaster or severe disruption to normal operations. The goal is to provide enough detail so that you could carry out the actions in the plan during an emergency or disruption, even if you had to do so without the aid of management staff.

A BCP should be considered a living document that evolves with your organization's risk profile and business needs. Expect to revise it after it has been created, as you discover new threats, vulnerabilities, interdependencies and other factors.

The first step in creating a BCP is to review your organization's risk profile. This involves conducting a risk analysis, which includes the following steps:

  • Identify assets, threats and vulnerabilities

  • Assign a probability estimate to each threat or vulnerability

  • Prioritize business functions according to criticality (high/medium/low)

  • Identify preventive and mitigative measures for each business function
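The risk analysis steps above can be captured in a simple risk register. The following is a minimal sketch, not a prescribed method: the `Risk` class, the `CRITICALITY_WEIGHT` values and the score formula (probability multiplied by a criticality weight) are illustrative assumptions, and the sample entries are made up.

```python
from dataclasses import dataclass

# Hypothetical weights for the high/medium/low criticality levels.
CRITICALITY_WEIGHT = {"high": 3, "medium": 2, "low": 1}


@dataclass
class Risk:
    business_function: str  # function the threat would disrupt
    threat: str             # what could go wrong
    probability: float      # estimated likelihood, 0.0 to 1.0
    criticality: str        # "high", "medium" or "low"

    def score(self) -> float:
        """Illustrative score: probability times criticality weight."""
        return self.probability * CRITICALITY_WEIGHT[self.criticality]


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Made-up example entries, not real estimates.
register = [
    Risk("Payroll", "power outage", 0.10, "medium"),
    Risk("Order processing", "hard drive crash", 0.30, "high"),
    Risk("Internal newsletter", "virus infection", 0.50, "low"),
]

for risk in prioritize(register):
    print(f"{risk.business_function}: {risk.threat} (score {risk.score():.2f})")
```

A spreadsheet works just as well; the point is that each threat gets an explicit probability estimate and each business function an explicit criticality level, so the ordering can be reviewed and challenged.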

The next step is to document the plan. This involves identifying the following:

  • Necessary resources (e.g., buildings, information technology systems)

  • Key personnel (e.g., managers, subject matter experts)

  • Emergency contacts (e.g., first responders, law enforcement)

  • Criteria for prioritizing procedures during a disruption or emergency

  • Communications requirements (e.g., public relations team members, media contacts)

After documenting the plan, identify the disruptions that may affect your organization and rate each one on a scale of 1 to 5, with 1 being low risk and 5 being high risk. Then plan the recovery procedures for each business function, starting with the highest-rated disruptions.

The following are some common disruptions that can affect an organization and should be covered by a BCP:

  • Fire, flood or other environmental incident

  • Utility failure/supply disruption (e.g., water, power)

  • Terrorist attack/natural disaster (e.g., earthquakes, hurricanes)

  • Computer outage/failure (e.g., hard drive crash, virus infection)

  • Civil disturbance/disorderly conduct

  • Data loss/security breach
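As a rough illustration of the 1-to-5 rating scale, the disruptions listed above could be recorded and ranked as follows. The ratings are placeholders chosen for the example, not recommendations, and `rank_disruptions` is a hypothetical helper name.

```python
# Ratings below are placeholders for illustration only.
disruption_ratings = {
    "Fire, flood or other environmental incident": 4,
    "Utility failure/supply disruption": 3,
    "Terrorist attack/natural disaster": 2,
    "Computer outage/failure": 5,
    "Civil disturbance/disorderly conduct": 1,
    "Data loss/security breach": 5,
}


def rank_disruptions(ratings: dict[str, int]) -> list[tuple[str, int]]:
    """Validate 1-5 ratings and return them from highest to lowest risk."""
    for name, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{name!r} has rating {rating}; expected 1 to 5")
    # sorted() is stable, so equal ratings keep their original order.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


for name, rating in rank_disruptions(disruption_ratings):
    print(f"{rating}  {name}")
```

Rejecting out-of-range values early keeps the scale consistent when several people contribute ratings.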

As part of the plan's implementation, review the plan regularly so that it continues to reflect changes in the business environment.

The final step in creating a BCP is testing it. Testing involves exercising the plan by role-playing emergency scenarios, including mock evacuations if applicable.

An effective BCP should answer the questions below:

  • What needs to be done? - As simply and briefly as possible, identify the key tasks that must be performed to continue critical functions. Include which departments must perform those tasks and any other details necessary for those departments to understand their roles and responsibilities.

  • Who does what? - Specify who is responsible for each task identified above. If there is a specific person or group of people assigned to perform a task, note their contact information and include an emergency contact if they cannot be reached.

  • When do we start? - Identify when and how you will begin to execute the plan. If more than one person is responsible for executing the plan, note what will happen if all of them are unavailable during a disruption.

  • Where do we go? - Identify the designated meeting places and evacuation routes.

  • How do we communicate? - Specify how critical personnel will be contacted and how potential issues and updates regarding the disruption or emergency response will be communicated to those people.

  • Who notifies whom? - For each type of disruption, such as a fire or a physical attack, note which individuals or groups need to be notified, who is responsible for notifying them and how they will be contacted (e.g., phone, email). This should include contact information for all appropriate first responders as well as your organization's legal counsel.

  • What do we do? - Identify specific actions that must be taken if a disruption occurs.

  • How do we recover? - Establish procedures for responding to the disruption, which may include evacuations or sheltering in place. Specify what actions should be taken at each phase of the recovery process (e.g., containment, mitigation, restoration). Note specific timeframes for various phases of the recovery process.

  • When are we safe? - Identify the earliest time when employees should reenter an affected building or area.
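Several of these questions ("What needs to be done?", "Who does what?" and what happens when the responsible person cannot be reached) can be recorded in a structured form. The sketch below is one possible layout under stated assumptions: the `Contact` and `PlanTask` classes, their fields and the sample data are hypothetical, and in a real incident reachability would be established by actually trying to contact each person.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    name: str
    phone: str
    reachable: bool = True  # placeholder; determined during a real incident


@dataclass
class PlanTask:
    description: str  # what needs to be done
    department: str   # which department performs the task
    # Contacts in escalation order: primary first, then backups.
    contacts: list[Contact] = field(default_factory=list)

    def responsible(self) -> Optional[Contact]:
        """Return the first reachable contact, or None if nobody is reachable."""
        for contact in self.contacts:
            if contact.reachable:
                return contact
        return None


# Made-up example task and contacts.
task = PlanTask(
    description="Restore order database from backup",
    department="IT",
    contacts=[
        Contact("Primary owner", "555-0100", reachable=False),
        Contact("Backup owner", "555-0101"),
    ],
)

owner = task.responsible()
print(owner.name if owner else "Escalate: no contact reachable")
```

Keeping contacts in escalation order makes the fallback explicit, so the plan still names someone to act when the primary owner is unavailable.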