About Time: Deciding When to Start Your RTO Countdown

Written by: Richard Long

Many organizations lack a clear, shared understanding of when the metaphorical switch is flipped to start the recovery time objective (RTO) countdown timer. There are two options, and either can work provided the organization takes a few key considerations into account.

Related on MHA Consulting: All About RTOs: What They Are and Why You Have To Get Them Right

A Common Source of Confusion 

Many organizations are unsure when the countdown for their RTOs begins.

An RTO is a time window within which, in the event of an outage, a critical business process or application needs to be returned to a fully productive state in order to prevent an unacceptable level of harm to the organization (as previously determined by a business impact analysis).  

Note that the problem under discussion is mostly an issue with highly critical processes that have very short RTOs, such as four hours or less. (This discussion also pertains to outages resulting from major events, not day-to-day availability issues.) 

Typically, some people will assume the countdown for the RTO begins at the time of the outage. Others will operate on the understanding it doesn’t begin until a disaster or recovery event is declared.  

One consequence of this confusion is worry and frustration among people who incorrectly think the organization is at risk of missing or has missed an RTO.  

A more fundamental problem arises when a lack of clarity about the company’s preferred approach leads to RTOs that don’t allow for the necessary decision-making time on the part of senior management. (More on this below.)

Outside the Scope of This Discussion 

As stated previously, this discussion mostly pertains to highly critical processes and apps with short RTOs.  

However, within that group there is a subset of processes and apps, usually very small, that are so critical they can never be down (or for which missing the RTO by even a few minutes would cause significant harm). These stand apart from the current discussion because they should already be architected for high availability.

This post is about functions that have fairly short RTOs but do not require immediate recovery.

Starting the Countdown at Formal Declaration 

Most organizations choose to have their RTO countdown begin at the time a recovery or disaster is formally declared. Such a declaration can be made within minutes or take over an hour.  

This can be considered the standard approach. 

A lengthy delay might occur before the recovery is declared because crisis teams and management need time to investigate the outage and decide whether it’s worthwhile to perform a recovery, which is a demanding and expensive undertaking.

Starting the Countdown at Event Time 

The other possible approach is to have the RTO countdown start automatically at the time of the outage. 

Organizations that use this method will still need time to analyze the outage and decide whether to mount a full recovery.  

However, with this approach, the time consumed by investigation and decision-making eats up part of the RTO window, leaving that much less time to recover the app or process. 
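The difference between the two approaches comes down to simple arithmetic. The following is a minimal sketch in Python, not a prescribed tool; the names `CountdownStart`, `recovery_deadline`, and `time_left_for_recovery` are hypothetical, and the 4-hour RTO and 45-minute decision time are assumed example values.

```python
from datetime import datetime, timedelta
from enum import Enum


class CountdownStart(Enum):
    """Hypothetical labels for the two approaches described above."""
    DECLARATION = "declaration"  # countdown starts at formal declaration
    EVENT = "event"              # countdown starts at the time of the outage


def recovery_deadline(outage_at, declared_at, rto, policy):
    """Return the moment by which the process should be restored."""
    start = declared_at if policy is CountdownStart.DECLARATION else outage_at
    return start + rto


def time_left_for_recovery(outage_at, declared_at, rto, policy):
    """Return how much of the RTO window remains once recovery is declared."""
    return recovery_deadline(outage_at, declared_at, rto, policy) - declared_at


# Assumed example: outage at 9:00, recovery declared 45 minutes later, 4-hour RTO.
outage = datetime(2024, 1, 1, 9, 0)
declared = outage + timedelta(minutes=45)
rto = timedelta(hours=4)

at_declaration = time_left_for_recovery(outage, declared, rto, CountdownStart.DECLARATION)
at_event = time_left_for_recovery(outage, declared, rto, CountdownStart.EVENT)

print(at_declaration)  # 4:00:00
print(at_event)        # 3:15:00
```

In this example, the 45 minutes spent on investigation and decision-making leave 3 hours 15 minutes for the recovery itself under the event-time approach, while the declaration approach preserves the full 4 hours.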

This is a less common approach but some organizations might have good reasons for doing it this way. 

Achieving Success with Either Approach 

Both methods can work provided the organization takes the following points into account: 

  • In both methods, leadership should try to conduct their assessment and make a decision about recovery as quickly as possible. 
  • The preferred timeframe for deciding whether to do a full recovery should be set during BIA and recovery plan development and stated in the crisis management plans. 
  • Organizations that opt to start the RTO countdown at event time need to budget time for analysis and decision-making in their RTOs. 
  • RTOs are guidelines, not hard deadlines. (Any process or app where the cost of going slightly over the RTO would be severe should be moved to a shorter RTO category to ensure it will not miss the required recovery.) 
  • Whatever approach the organization decides on, the decision must be clear and widely communicated and understood throughout the organization.  
  • The chosen approach should be incorporated in the BIA and factored into the setting of RTOs. 

By taking these items into account, an organization can achieve success no matter when it decides to start the metaphorical countdown timer for its RTOs. 
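For organizations that start the countdown at event time, budgeting for decision-making can be checked with a quick calculation. This is an illustrative sketch only; the function names `event_time_rto` and `fits_rto` are hypothetical, and the durations are assumed example values rather than recommendations.

```python
from datetime import timedelta


def event_time_rto(decision_budget, recovery_duration):
    """Smallest RTO that covers both the decision and the recovery work."""
    return decision_budget + recovery_duration


def fits_rto(rto, decision_budget, recovery_duration):
    """True if decision-making plus recovery fits inside the RTO window."""
    return decision_budget + recovery_duration <= rto


# Assumed example: 1 hour allowed for the decision, 3.5 hours of recovery work.
decision_budget = timedelta(hours=1)
recovery_duration = timedelta(hours=3, minutes=30)

needed = event_time_rto(decision_budget, recovery_duration)
ok_with_four_hours = fits_rto(timedelta(hours=4), decision_budget, recovery_duration)

print(needed)              # 4:30:00
print(ok_with_four_hours)  # False
```

Here a 4-hour RTO counted from event time would not be realistic: either the RTO needs to be 4.5 hours, or the decision timeframe or the recovery work needs to be shortened.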

The Importance of Clear Communication 

At many organizations, confusion reigns regarding when, in the event of an outage, the RTO countdown timer begins. This confusion can have consequences ranging from unnecessary turmoil to unrealistic and badly missed RTOs. 

There are two possible approaches to deciding when to initiate the RTO countdown. The standard approach is for the timer to start when a recovery is formally declared or approved. The alternative, in which the countdown starts automatically at the time of the event, might work for some organizations.

Either approach can work provided a handful of key considerations are taken into account. These include referencing the chosen approach when setting the individual RTOs and making sure it is clearly communicated throughout the organization. 
