#BackToBasics – RPO and RTO

18 June 2012 by Alice Cullen

It would be fair to say that without the internet, computers or technology, many of today’s businesses would flounder. Let’s think for a minute about what would happen if your business’s computer systems failed.

No email. No SQL database with all of your customer data. No website. For any modern business this could be catastrophic.

So, what can we do?

Business continuity gives your business the ability to bounce back from unexpected disruptions and disasters – this could be anything from a fire or flood to a malicious attack or hacking attempt. Business continuity isn’t something to think about at the time of the disaster, though. It is a pre-emptive measure, like insuring your car or home, to fall back on when the worst happens.

The first step to successful business continuity is to fully understand the level to which you should be protecting your IT against disasters. In order to do this, you need to identify the critical areas of your business: the email system that enables communication between teams; the website that drives your sales and marketing presence; the SQL server that contains your business intelligence; the list goes on.

Each of these systems or processes contributes to the day-to-day functioning of the business and a value can be attached to each. This value determines the effect that its unavailability would have on the business, and from this we can establish a cost-per-unit (of time) to calculate the cost of the unavailability of the service.
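As a rough illustration of the cost-per-unit idea, the sketch below multiplies an hourly value by the length of an outage. The function name `downtime_cost` and the hourly figures are hypothetical and chosen purely for illustration; a real assessment would use values from your own business analysis.

```python
def downtime_cost(cost_per_hour: float, hours_unavailable: float) -> float:
    """Estimate the cost of a service being unavailable for a period of time."""
    if cost_per_hour < 0 or hours_unavailable < 0:
        raise ValueError("cost and duration must be non-negative")
    return cost_per_hour * hours_unavailable


# Hypothetical hourly values per service (illustrative only, not real data).
hourly_value = {"email": 200.0, "website": 1500.0, "sql_server": 800.0}

# Estimated cost of a four-hour outage for each service.
for service, rate in hourly_value.items():
    print(f"{service}: {downtime_cost(rate, 4):,.2f}")
```

Comparing these estimates across services is one simple way to see which systems justify the most protection.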

It is just as important to calculate the amount of acceptable irrecoverable data loss from a system. This can be difficult to determine, as in an ideal world businesses would remain online and lose no data at all in the event of a disaster. However, solutions that enable this come with astronomical costs that are simply out of the reach of small businesses and start-ups. Calculating an acceptable amount of data that could be lost without having a disastrous effect on the business helps to establish a workable business continuity strategy.

To help us understand this better, there are two key concepts which underpin the cost model for Disaster Recovery.

Recovery Point Objective (RPO)

The amount of data loss, expressed as an amount of time, that is acceptable in the event of a disaster. E.g. an RPO of 5 hours: a backup of data is taken at 1pm and the next is scheduled for 6pm. Any disaster affecting the original data before the second scheduled backup will result in a maximum of 5 hours of lost data; a disaster at 5:59pm, for instance, loses everything changed since 1pm.
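The RPO example can be sketched in a few lines of Python. The helper names `data_loss_window` and `schedule_meets_rpo` are hypothetical, and the sketch assumes that everything changed since the last successful backup is lost.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Return how much data (measured in time) is lost if disaster strikes."""
    if disaster < last_backup:
        raise ValueError("disaster cannot precede the last backup")
    return disaster - last_backup


def schedule_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A backup schedule meets the RPO if backups are at least that frequent."""
    return backup_interval <= rpo


# Backup taken at 1pm, disaster at 5:59pm, RPO of 5 hours.
last_backup = datetime(2012, 6, 18, 13, 0)
disaster = datetime(2012, 6, 18, 17, 59)
rpo = timedelta(hours=5)

lost = data_loss_window(last_backup, disaster)
print(lost, lost <= rpo)
print(schedule_meets_rpo(timedelta(hours=5), rpo))
```

The worst case is always just under one backup interval, which is why the interval between backups is the figure to compare against the RPO.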

Recovery Time Objective (RTO)
This is the amount of time, set as an objective (not a mandate), within which attempts are made to restore service to the business following a disaster. It is defined by the business following detailed analysis of how long the business could survive without a service, system or business process. E.g. an RTO of 5 hours: a disaster occurs at 10am, so services should be restored by 3pm.
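The RTO example works the same way, only looking forward from the disaster rather than back. This is a minimal sketch; `restore_deadline` and `rto_met` are hypothetical names used for illustration.

```python
from datetime import datetime, timedelta


def restore_deadline(disaster: datetime, rto: timedelta) -> datetime:
    """Return the time by which service should be restored under the RTO."""
    return disaster + rto


def rto_met(disaster: datetime, restored: datetime, rto: timedelta) -> bool:
    """Check whether service came back within the objective."""
    return restored <= restore_deadline(disaster, rto)


# Disaster at 10am with an RTO of 5 hours gives a 3pm target.
disaster = datetime(2012, 6, 18, 10, 0)
rto = timedelta(hours=5)

print(restore_deadline(disaster, rto))
print(rto_met(disaster, datetime(2012, 6, 18, 14, 30), rto))
```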

Impact on cost of an incident in relation to RPO and RTO

[Diagram: Cost impact in relation to RPO/RTO]

As shown in the above diagram, the RPO and RTO mark the ‘acceptable’ time either side of the moment the disaster hit (shown by the red line): the RPO looks back to the last recoverable data, and the RTO looks forward to when full service is resumed.

The curve of each of these lines demonstrates how cost increases as the objective amount of time is reduced. Together they show whether it is less acceptable to be without a service for a period of time (RTO), less acceptable to lose data (RPO), or both.

Short RPO and RTO

Short RPO + Short RTO = High cost

It is clear to see from the graph above that implementing a solution with both a short RPO and RTO comes with high costs.

To explain this, let’s look at an e-commerce site as an example. Assuming that the site is critical to the operation of the business (i.e. taking orders and payments for products to be shipped from a warehouse), we know that it requires a short RPO and RTO.

To enable short RPO and RTO, we can geographically disperse the IT that provides the service across multiple datacentres, mitigating the failure of one entire datacentre. This increases costs because it requires twice the investment in IT hardware and setup, although losing an entire datacentre is, of course, a very unlikely occurrence.

Short RPO, Long RTO

Short RPO, Long RTO = Lower cost

The costs are lower to implement a solution which requires short RPO and long RTO. An example in this case may be a website where the data is constantly updated with important changes but is only referred to at certain times of the day. If the solution can be offline for a while and the changes can be applied once it is back up and running, it is possible to afford a longer RTO and, therefore, a lower cost.

Long RPO, Short RTO

Long RPO, Short RTO = Lower cost

Like the previous (Short RPO, Long RTO) scenario, implementing a solution which requires long RPO and short RTO comes with lower costs. An example in this case may be a website where the data collected (if any) is not critical to the operation of the business but the fact that the site is continually ‘up’ is crucial.

Long RPO and RTO

Long RPO, Long RTO = Low cost

The costs are low to implement a solution which can afford long RPO and RTO. This approach to DR is suitable only for websites that are deemed non-critical or where the budget cannot stretch – especially with the extra reassurance of UKFast’s 100% uptime guarantee and hardware replacement guarantee.
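The four combinations above can be summarised as a small lookup. This is only a sketch: the one-hour cut-off between ‘short’ and ‘long’ is a hypothetical assumption (the right threshold depends on the business), and `relative_cost` is an illustrative name.

```python
from datetime import timedelta

# Hypothetical cut-off: objectives at or below this count as "short".
SHORT_THRESHOLD = timedelta(hours=1)


def relative_cost(rpo: timedelta, rto: timedelta,
                  threshold: timedelta = SHORT_THRESHOLD) -> str:
    """Map an RPO/RTO pair onto the relative cost bands described above."""
    short_rpo = rpo <= threshold
    short_rto = rto <= threshold
    if short_rpo and short_rto:
        return "High cost"
    if short_rpo or short_rto:
        return "Lower cost"
    return "Low cost"


# An e-commerce site needing both objectives short lands in the top band.
print(relative_cost(timedelta(minutes=5), timedelta(minutes=15)))
# A non-critical site that can tolerate a day either way lands in the bottom band.
print(relative_cost(timedelta(hours=24), timedelta(hours=24)))
```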

So, in summary

There is a huge range of methods for implementing business disaster recovery solutions, however large or small. Implementing the basics – such as dual power supplies or resilient disks in servers – will help to protect from the issues of hardware failure. More advanced solutions involve multiple-datacentre configurations, virtualisation and clustering to provide progressively better RPO/RTO combinations.

The key is not to forget about the potential for disaster and its potential effect on business. Assess risk, prioritise resources and build as recoverable a solution as the given budget will allow. This is where it is important to note that IT budgets will be one of the primary influences on the choice of RPO and RTO for any business.

The UKFast team are positioned to assist with any enquiries regarding how Disaster Recovery solutions can assist your business IT. If you have any questions or concerns, please give us a call.