
Make Sure Your Disaster Recovery Plan Isn’t Just Words on Paper

A written disaster recovery (DR) plan is a good start towards making sure your business can resume operations after an outage, but you won’t know how good those words are until you put them into action. Because you don’t want to find out your plan is incomplete or incorrect during a crisis, it’s important to schedule periodic disaster recovery tests to try out your plan before you need to execute it for real.

Types of Disaster Recovery Tests

There are several different ways you can test your plan:

  • Circulate for comment. Distribute the plan to everyone who would participate in it and solicit their comments and feedback.
  • Walk through the plan. Gather everyone who would participate in the plan in a conference room or on a conference call. Read through the plan as a group, out loud rather than silently. Because this approach involves group interaction, you’re likely to surface issues that won’t be identified when individuals read through the plan separately.
  • Tabletop testing. As in a walkthrough, the participants gather together. Rather than reading through the plan in isolation, they are presented with a typical failure situation and called upon to resolve it. This can reveal planning gaps and failures that the DR plan does not address. It’s important to choose realistic failure scenarios and not to inform the participants of the scenario in advance.
  • Parallel test the plan. Bring up the disaster recovery systems and test whether they can execute a day’s work. The production systems run in parallel, so the only impact on routine business is that some personnel have to perform tasks on the disaster recovery systems.
  • Failover test. Simulate a production outage by gracefully shutting down the primary servers and failing over to the secondary site. This test method impacts ordinary production work, so it may be better to run it on a weekend or during another low-volume period. It also requires additional work to bring the primary servers back online after the test is complete.

Learn more in Craft An Effective Disaster Recovery Plan.

Disaster Recovery Test Follow-up

Whichever test strategy you choose, the test process isn’t over when the final system is brought back online. After the test, the DR plan needs to be updated to reflect:

  • Missing applications. It’s not uncommon for applications to be overlooked when the DR plan is written.
  • Missing or incorrect steps. The processes for bringing up applications may omit steps, overlook dependencies, list steps in the wrong sequence, or contain errors in the details of the commands to be executed.
  • Incorrect timings. Every application should have a recovery time objective (RTO) that the recovery plan attempts to meet. If the test shows recovery can’t meet those objectives, the plan needs to be revisited to determine how it can be altered.
  • Missing communication. Plans often fail because important notification steps are omitted.
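
The timing check described above can be made routine by recording how long each application took to recover during the test and comparing that against its recovery time objective. The following is a minimal sketch in Python, not a prescribed tool: the RecoveryResult class, the find_rto_misses function, the application names, and every figure are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RecoveryResult:
    """One application's outcome from a DR test (all names are illustrative)."""
    application: str
    rto_minutes: int        # recovery time objective taken from the DR plan
    measured_minutes: int   # how long recovery actually took during the test


def find_rto_misses(results):
    """Return (application, overrun_minutes) for each result that exceeded its RTO."""
    return [
        (r.application, r.measured_minutes - r.rto_minutes)
        for r in results
        if r.measured_minutes > r.rto_minutes
    ]


# Hypothetical figures, not data from a real test.
test_results = [
    RecoveryResult("order-entry", rto_minutes=60, measured_minutes=45),
    RecoveryResult("email", rto_minutes=120, measured_minutes=150),
    RecoveryResult("payroll", rto_minutes=240, measured_minutes=240),
]

misses = find_rto_misses(test_results)
for application, overrun in misses:
    print(f"{application}: missed RTO by {overrun} minutes")
```

In this sketch an application that recovers in exactly its objective counts as meeting it, so only the email entry is reported. Any application that appears in the list is a candidate for the plan revisions described above.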

In addition, you should always consider how the plan would have worked if this had been an actual, unscheduled outage.

Learn more in Don’t Improvise Your Way Through Disaster Recovery.

Repeat the Test

If there were major failures during the test, take time to revise the plan to reflect those problems and then schedule another test to verify the corrections. If the recovery process mostly worked as planned, you can wait until your next regularly scheduled test—usually annually, though some prefer twice annually or even quarterly—to test the update.

CCS Technology Group offers disaster recovery planning services. Disaster recovery testing is an important part of your business continuity strategy. Contact CCS Technology Group to learn more about writing and testing your DR plan.