In our daily lives, we strive to protect ourselves from the worst. We buy various forms of insurance and protection for our cars, homes and health, and we safeguard our personal information. Shouldn't business owners and information technology (IT) managers treat their networks and critical infrastructure the same way?
Despite the compelling imperative "protect your IT or be prepared to suffer devastating business interruptions," the majority of businesses under-invest in business continuity (BC) and disaster recovery (DR) planning. What are the results of a failure to protect your infrastructure properly? A U.S. National Archives and Records Administration study found that 25 percent of companies experiencing an IT outage of 2 to 6 days went bankrupt immediately, with even more following in the longer term.
Business owners and executives juggle a number of projects each day that draw on their time and resources. As a result, they tend to defer business continuity into the "solve tomorrow" column until right before (or right after) an incident. This is a critical, sometimes disastrous mistake. Like all business-essential IT programs, designing and implementing a functional continuity plan is a multi-month process. Here is why:
- Business Continuity is a Business Process: A functional business continuity plan is more about understanding and protecting key business processes than it is about managing IT assets. As such, it will require input from key business leaders and will necessitate in-depth planning and preparation so that every person in the organization knows what to do in the event of an emergency
- Assessment and Design: Developing the core business continuity plan is not a one-person job; it requires input from a cross-functional team that includes sales, communications, finance, back office, human resources and IT leaders. Without that input, there is no reliable way to prioritize and tier support systems so that they meet demand during a disaster
- Back Order of Critical Elements: In recent years, the backlog for business-sized power generators from reliable manufacturers was in many cases 30 weeks. That means if you order today, your generator will be available in roughly seven months. If you try to substitute a generator from your local Home Depot for this essential element, you will find that the power from household generators is too unstable for IT equipment without some type of power-conditioning device. In some cases, the equipment simply won't power on, while in others you risk permanent damage to your assets
- Entering the Telecommunication Queue: Most telecommunications providers are slow to connect backup lines to businesses, for two reasons. First, as the backup service provider is (we would hope) different from your primary provider, you will have to initiate a new business relationship, including all of the associated legal and administrative hurdles. Second, as the backup line will not represent a sizeable business opportunity, you will have to wait in line behind the more profitable opportunities - including some that enter the queue after you do
- Implementation: Once all of the pieces of the continuity solution are in place, building the system, connecting it to ongoing IT programs and aligning it with the corresponding business processes takes time. Assuming that the IT staff (or person) will also have to focus on their regular job at the same time that they implement the new system, view this commitment in days or weeks as opposed to hours
- Temporary Relocation: In many disaster scenarios, resuming operations at the same location will no longer be possible. Within that reality, it will be necessary to have plans and agreements in place for a backup location as well as any logistical considerations necessary for resuming operations in the new facility. This also involves pre-planning the IT layout for the backup location to ensure that all of your critical systems will function at the secondary facility
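The lead times in the list above can be turned into a rough schedule. The following is a minimal sketch, not a planning tool: only the 30-week generator figure comes from the text, while the other durations, the names `LEAD_TIMES_WEEKS` and `IMPLEMENTATION_WEEKS`, and the assumption that procurement items run in parallel are hypothetical placeholders.

```python
from datetime import date, timedelta

# Lead times in weeks. Only the generator figure comes from the text;
# the other values are hypothetical placeholders for illustration.
LEAD_TIMES_WEEKS = {
    "generator_delivery": 30,    # from the text
    "backup_telecom_line": 12,   # hypothetical
    "assessment_and_design": 6,  # hypothetical
}
IMPLEMENTATION_WEEKS = 3  # hypothetical; the text only says "days or weeks"


def earliest_ready_date(start: date) -> date:
    """Return the earliest date the continuity plan could be operational.

    Assumes all procurement items start in parallel on `start` and that
    implementation begins once the slowest item has arrived.
    """
    longest_wait = max(LEAD_TIMES_WEEKS.values())
    return start + timedelta(weeks=longest_wait + IMPLEMENTATION_WEEKS)


if __name__ == "__main__":
    print(earliest_ready_date(date.today()))
```

Even under these optimistic assumptions, the slowest item sets the schedule, which is why deferring the first order pushes everything else back.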
Getting It Done: The Roadmap
The link between business continuity and disaster survivability is significant. If you are inclined to agree, we recommend that you get started today with the following steps:
- Conduct a business impact assessment: As mentioned above, convene a cross-functional team to evaluate the business requirements and tier data based on its importance to operations
- Establish a downtime threshold: When building a DR plan, the first objective should be to decide the recovery point objective (RPO) and recovery time objective (RTO). The RPO dictates the allowable data loss, while the RTO is the amount of time applications can afford to be down - the maximum tolerable outage. These provide critical context for the remaining steps of the process
- Take steps to protect data: Organizations should back up data frequently to ensure records are kept, and consider upgrading to faster backup equipment to reduce the time it takes to complete a backup cycle
- Review power options: Organizations should add uninterruptible power supplies (UPS) for critical servers, network connections and selected personal computers to keep the most essential applications running
- Consider telecommunications alternatives: Telecommunications backup must involve both redundancy and alternatives. In the case of spot outages, redundancy may be enough. For larger outages, alternative communications vehicles, including wireless phones, wireless data cards and satellite phones, should be considered.
- Form tight relationships with vendors: A strong relationship with hardware, software, network and service vendors can help expedite recovery, as these vendor contacts often can work to ensure priority replacement of critical telecommunications equipment, personal computers, servers and network hardware in the event of a disaster. This is especially important for small- and medium-size organizations, which may lack the resources that larger companies can tap in an emergency
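The downtime threshold from the roadmap above can be checked against a backup schedule with a few lines of code. This is a minimal sketch under stated assumptions: the class `RecoveryObjectives`, the function `find_gaps` and the example figures are hypothetical, and the check assumes the worst case, in which a failure strikes just before the next backup.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjectives:
    rpo: timedelta  # allowable data loss, expressed as a span of time
    rto: timedelta  # maximum tolerable outage


def find_gaps(objectives, backup_interval, estimated_restore_time):
    """Return a list of shortfalls; an empty list means both objectives are met."""
    gaps = []
    # Worst case: everything written since the last backup is lost,
    # so the data loss equals one full backup interval.
    if backup_interval > objectives.rpo:
        gaps.append("backup interval exceeds RPO")
    if estimated_restore_time > objectives.rto:
        gaps.append("restore time exceeds RTO")
    return gaps


if __name__ == "__main__":
    # Hypothetical figures: nightly backups against a 4-hour RPO.
    targets = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=8))
    print(find_gaps(targets, timedelta(hours=24), timedelta(hours=6)))
```

A nightly backup cannot satisfy a four-hour RPO no matter how fast the restore is, which is the kind of mismatch the threshold step is meant to surface before the remaining steps are budgeted.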
Test Your Disaster Recovery Plan
Your business continuity system is only as good as its last test. Like flashlight batteries, smoke detectors or brakes, you don't want to find out about shortfalls during an emergency. Regular and systematic testing in a number of different situations will consume time and effort, but it is the only way to know if systems are functioning properly. Plan to extend tests over weeks and months to make sure that the system aligns fully with business operations
Once the downtime threshold is established and the DR plan is in place, periodic testing should occur. Testing equals time and money, so the frequency with which an organization can test depends on the DR budget. As a benchmark, SMBs should test no less than twice annually. If it is impossible to test the entire system more than twice a year, organizations should periodically test the most critical applications and systems. Further, tests should be conducted during busy seasons and should be unannounced to all but a few personnel in order to simulate the urgency of a real disaster. Lastly, IT managers should review the process after each test to establish what worked and what did not, so any errors can be rectified.
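The twice-a-year benchmark is easy to audit from a log of past tests. The sketch below is illustrative only: the function `years_below_benchmark` and the sample dates are hypothetical, and it counts full-system tests per calendar year, which is one possible reading of "twice annually".

```python
from collections import Counter
from datetime import date

MIN_TESTS_PER_YEAR = 2  # benchmark for SMBs given in the text


def years_below_benchmark(test_dates, years):
    """Return the years in `years` with fewer than MIN_TESTS_PER_YEAR tests."""
    counts = Counter(d.year for d in test_dates)
    return [year for year in years if counts[year] < MIN_TESTS_PER_YEAR]


if __name__ == "__main__":
    # Hypothetical test log.
    log = [date(2023, 3, 1), date(2023, 11, 15), date(2024, 6, 1)]
    print(years_below_benchmark(log, [2023, 2024]))
```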
In the end, there is no guarantee against a natural or man-made disaster, only the very high probability that you will fail without a detailed business continuity and disaster recovery plan. Though the time and resources required will pull against other organizational priorities, executives must dedicate the time to ensure business survivability. Tomorrow is already too late.
About the Author
Firooz Ghanbarzadeh is Director, Technology Services & Solutions at CDW Corporation.