Data Protection: Setting the Right Objectives
Recognizing where there may be problems in an organization's data protection strategy is not enough. Organizations need to understand what the right objectives for the risk management part of a data protection strategy should be. Setting the right objectives is critical, but not necessarily easy.
The objectives that an organization should consider are data availability, data preservation, data responsiveness (i.e., getting the data to the user within a reasonable time), and data confidentiality. These objectives must be kept in balance with one another, because changes in one area affect others, and therefore too much focus on one objective can lead to problems meeting another objective.
How High Is High Enough for Data Availability?
High availability, in which unplanned downtime is no more than seconds or only a relatively few minutes per year, is frequently a key objective in a data protection strategy, and one of the keystones of business continuity. However, an overemphasis on high availability can lead to problems with data preservation (all the money goes into keeping the systems up, and very little goes into preventing data loss when they do go down), data responsiveness (fault-resilient storage often does not restore as quickly), and data confidentiality (all the money goes to keeping the systems up, and very little to protecting the data from unauthorized exposure). As a result, an organization may not meet its real data protection goals and probably will spend more than necessary for data protection.
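To make "seconds or only a relatively few minutes per year" concrete, the following minimal sketch converts an availability percentage into the unplanned downtime it allows in a year. The function name and the sample availability levels are illustrative assumptions, not figures from any standard, and a 365-day year is assumed.

```python
# Illustrative sketch: convert an availability fraction into allowed downtime.
# Assumes a 365-day (non-leap) year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability: float) -> float:
    """Return the unplanned downtime per year implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * SECONDS_PER_YEAR


# Sample levels chosen for illustration only.
for label, fraction in [("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)]:
    minutes = downtime_seconds_per_year(fraction) / 60
    print(f"{label} availability allows about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, so the absolute time gained at each step is smaller than at the step before.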
High availability depends on the entire IT infrastructure (see Figure 1) and not just on the storage part of that infrastructure. For example, if a network is unavailable for any reason, data on a disk array that an application accesses over that network is also unavailable, even if the disk array is working perfectly.
Figure 1. High Availability Depends on the Entire IT Infrastructure
Not all applications need the same level of availability. However, for those applications that do need high availability, all of the relevant components of the IT infrastructure have to be tuned to the same relative level of protection. Otherwise, a weakest-link-in-the-chain problem exists.
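The weakest-link effect can be sketched numerically. The example below assumes that an application needs every component in its path to be up and that component failures are independent, so the end-to-end availability is the product of the individual availabilities. The component names and figures are hypothetical.

```python
from math import prod


def chain_availability(component_availabilities):
    """Availability of a path in which every component must be up.

    Assumes failures are independent, which is a simplification.
    """
    return prod(component_availabilities)


# Hypothetical availability figures for one application's path.
path = {"server": 0.9999, "network": 0.999, "disk array": 0.99999}

overall = chain_availability(path.values())
weakest = min(path, key=path.get)

print(f"End-to-end availability: {overall:.5f}")
print(f"Weakest link: {weakest} ({path[weakest]})")
```

Under these assumptions the end-to-end figure is always lower than that of the weakest component, so spending more on a disk array that is already the strongest link changes the result very little.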
Under normal circumstances the overall IT infrastructure is unlikely to have been designed for high availability. High availability is also a relative term: there are successively higher levels of availability, and each additional level yields a smaller improvement in overall uptime, measured in seconds or minutes. Unfortunately, as availability increases incrementally, costs tend to rise quickly. Moving from a single point of failure (where, if a component fails, there is no alternative component to fail over to with little or no downtime) to a design with no single point of failure is expensive, because the second component is redundant: it costs more than was necessary to provide the original functionality. (Sometimes the alternative component can share the workload with the original component, which improves performance. However, when one component fails, the remaining component has to do the work of both, which may degrade performance to an unacceptable level.)
However, if any part of the infrastructure can be significantly improved for a relatively small increase in cost, the investment is probably worthwhile.
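The trade-off between redundancy and cost can be illustrated with a rough model. The sketch below assumes identical, interchangeable components that fail independently and fail over instantly, so a group is down only when every copy is down. The 99% starting availability is a hypothetical figure.

```python
HOURS_PER_YEAR = 365 * 24  # 365-day year assumed


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` interchangeable components when any one suffices.

    Assumes independent failures and instantaneous failover.
    """
    if copies < 1:
        raise ValueError("at least one component is required")
    return 1.0 - (1.0 - single) ** copies


def downtime_hours(availability: float) -> float:
    """Expected downtime per year, in hours."""
    return (1.0 - availability) * HOURS_PER_YEAR


previous = None
for copies in (1, 2, 3):
    hours = downtime_hours(redundant_availability(0.99, copies))
    saved = "" if previous is None else f", {previous - hours:.3f} hours saved by the extra copy"
    print(f"{copies} component(s): {hours:.3f} hours down per year{saved}")
    previous = hours
```

In this model each extra copy costs roughly the same as the first component, but the second copy recovers far more downtime than the third. That is the pattern to look for when judging whether an incremental investment is worthwhile.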
All else being equal, incremental investment to increase the availability of storage is preferable to incremental investment to increase the availability of other parts of the IT infrastructure, such as servers or databases. The reason is that, if carefully done, investment in storage availability typically can improve data preservation as well. The data is more likely to be preserved from permanent data loss because there are one or more additional copies from which the data can be restored if necessary. That is an important side benefit that investing in another component, such as a network switch, cannot provide.
SNIA's Data Value Classification: A Point of Departure
The Storage Networking Industry Association (SNIA) has defined two key terms with respect to the availability of data through data protection (from the SNIA Dictionary at www.snia.org):
Recovery Point Objective (RPO): The maximum acceptable time period prior to a failure or disaster during which changes to data may be lost as a consequence of recovery. Data changes preceding the failure or disaster by at least this time period are preserved by recovery. Zero is a valid value and is equivalent to a "zero data loss" requirement. In effect, the RPO defines the acceptable amount of permanent data loss, where permanent means that the data cannot be restored through the use of IT. In some cases manual reentry of the data may be possible, but that option is not often available.
Recovery Time Objective (RTO): The maximum acceptable time period required to bring one or more applications and associated data back from an outage to a correct operational state. This is the maximum downtime that an application should suffer for a single failure event.
The Data Management Forum (DMF) within SNIA has further defined a third term (from the "SNIA Implementation Guide for Data Protection," March 2004):
Data Protection Window (DPW): Like the backup window, this is the available time during which a system can be quiesced and the data can be copied to a redundant repository without impacting business operations. This is critical, because data has to be protected. The question is how that protection can be provided without unnecessary "unavailability" of systems.
Building on these definitions, the DMF goes on to define five classes for data value classification and the resulting RPO, RTO, and DPW for each data value class (see Table 1).
Table 1. DMF Data Value Classification
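As a rough illustration of how the RPO and RTO definitions above apply to a single failure, the sketch below compares one recovery against a pair of objectives. The class names, field names, and the sample times are hypothetical and are not taken from the SNIA Dictionary or the DMF guide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Objectives:
    rpo: timedelta  # maximum acceptable window of lost data changes
    rto: timedelta  # maximum acceptable downtime for one failure event


@dataclass
class RecoveryEvent:
    last_recoverable_point: datetime  # newest point in time that recovery can restore
    failure_time: datetime            # when the outage began
    service_restored: datetime        # when the application was operational again


def data_loss_window(event: RecoveryEvent) -> timedelta:
    """Span of data changes lost as a consequence of recovery."""
    return event.failure_time - event.last_recoverable_point


def downtime(event: RecoveryEvent) -> timedelta:
    """Time from the failure until correct operation resumed."""
    return event.service_restored - event.failure_time


def meets_objectives(event: RecoveryEvent, objectives: Objectives) -> dict:
    """Report separately whether the RPO and the RTO were met."""
    return {
        "rpo_met": data_loss_window(event) <= objectives.rpo,
        "rto_met": downtime(event) <= objectives.rto,
    }


# Hypothetical objectives and a hypothetical outage.
objectives = Objectives(rpo=timedelta(minutes=15), rto=timedelta(hours=4))
event = RecoveryEvent(
    last_recoverable_point=datetime(2004, 3, 1, 9, 50),
    failure_time=datetime(2004, 3, 1, 10, 0),
    service_restored=datetime(2004, 3, 1, 15, 0),
)
result = meets_objectives(event, objectives)
print(result)
```

In this example ten minutes of changes are lost, which is within the 15-minute RPO, but the five-hour outage exceeds the four-hour RTO. The two objectives are independent, so a recovery can satisfy one and miss the other.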
The DMF then defines a five-step implementation guide for data protection:
- Identify data value class
- Define best solution
- Select specific components
- Check system cost
- Confirm decision or change
The five-step process seems reasonable, but then the implementation guide adds that "if cost is too high, change data value class or specific components." In other words, if you can't afford it, change your mind about how important it is!
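The feedback loop in that process can be sketched as follows. This is only an illustration of the logic as described here, not the DMF's own procedure: the class numbering (1 as the most demanding), the cost figures, and the function name are all hypothetical, and the "define best solution" and "select specific components" steps are collapsed into a simple cost lookup.

```python
from typing import Optional

# Hypothetical cost of an adequate solution for each data value class.
# Class 1 is assumed to be the most demanding; none of these figures
# come from the DMF guide.
SOLUTION_COST = {1: 500_000, 2: 200_000, 3: 80_000, 4: 30_000, 5: 10_000}


def choose_class(initial_class: int, budget: int) -> Optional[int]:
    """Return the class that the loop settles on, or None if nothing is affordable."""
    data_class = initial_class            # step 1: identify data value class
    while data_class in SOLUTION_COST:
        cost = SOLUTION_COST[data_class]  # steps 2-4: solution, components, cost
        if cost <= budget:
            return data_class             # step 5: confirm decision
        data_class += 1                   # step 5: change to a less demanding class
    return None


settled = choose_class(initial_class=1, budget=100_000)
print(f"Data judged to be class 1 ends up protected as class {settled}")
```

With these figures, data originally judged to belong in the most demanding class is protected as class 3, which shows how the budget rather than the value of the data ends up setting the classification.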
That work was a valiant effort to tackle the very difficult issue of how to equate the value needed for risk-based data protection with an implementation strategy, and it serves as a good "straw man" starting point for thinking more deeply about the issue. The problem is that accepting the data value classification and implementation strategy at face value might not be the best strategy for IT management. An enterprise attempting to use that approach should think clearly about its applicability to the enterprise's circumstances. The reasons for looking closely at the pertinence of Table 1 are as follows:
- Value is not the same as availability; making the assumption that they are equivalent can lead to a misallocation of data protection investment dollars.
- The RPO and RTO for operational recovery and disaster recovery are not necessarily the same; making the assumption that they are the same can lead to a misapplication of resources when making a recovery.
- Availability is only one data protection objective; giving availability excessive weight versus the other objectives can lead to mistakes in protecting data.
Certain names and logos on this page and others may constitute trademarks, servicemarks, or tradenames of
Taylor & Francis LLC. Copyright © 2008-2013 Taylor & Francis LLC. All rights reserved.