Disaster recovery: Where time matters


Disasters can strike at any time. They may be caused by human error, cyber-attacks or natural events such as earthquakes, fires, floods and hurricanes. Even so, it is tempting to sit back, relax and not worry about the consequences for one's business, often for cost reasons, but investment in business continuity is like an insurance policy. Nor is it just about disaster recovery: the best way to prevent downtime is to stay a step ahead of any potential disaster scenario.

Yet when unforeseen incidents do occur, the organisation’s disaster recovery plan should instantly kick in to ensure that business continuity is maintained with little or no interruption. An e-commerce firm, for example, could lose sales to its competitors if its website goes down, and downtime can also damage the company’s brand reputation. For these reasons alone business continuity can’t wait, yet large volumes of data have traditionally needed a batch window for backup and replication, something that becomes increasingly challenging with the growth of big data.

Avoiding complacency

So are organisations taking business continuity seriously? They are, according to Claire Buchanan, chief commercial officer (CCO) at Bridgeworks: “I think that most businesses take business continuity seriously, but how they handle it is another thing”. In other words, it is how companies manage disaster recovery and business continuity that makes the difference.

These two disciplines are in many respects becoming synonymous. “From what I understand from Gartner, disaster recovery and business continuity are merging to become IT services continuity, and the analyst firm has found that 34% of inbound calls from corporate customers, those that are asking for analyst help, are about how they improve their business continuity”, she says.

Phil Taylor, director and founder of Flex/50 Ltd, concurs with this view, stating that a high percentage of organisations are taking disaster recovery and business continuity seriously. “Businesses these days can’t afford to ignore business continuity, particularly because of our total dependence on IT systems and networks”, he says. The ongoing push for mobile services and media-rich applications will, he adds, generate increasing transaction rates and huge data volumes.

Buchanan nevertheless adds that most businesses think they are ready for business continuity, but it is once disasters actually strike that the real problems occur. “So what you’ve got to be able to do is to minimise the impact of unplanned downtime when something disruptive happens, and with social media and everything else the reputational risk of a business not being able to function as it should is huge”, she explains. In her experience, the problem is that awareness slips as time goes on.

Bryan Foss, a visiting professor at Bristol Business School and a Fellow of the British Computer Society, finds: “Operational risks have often failed to get the executive and budgetary attention they deserve as boards may have been falsely assured that the risks fit within their risk appetite.” Another issue is that you can’t plan for when a disaster will happen, but you can plan to prevent it from causing loss of service availability, financial harm or reputational damage.

To prevent damaging issues from arising, Buchanan says organisations need to be able to support end-to-end applications and services whose availability is unaffected by disruptive events. When those events do occur, the end user shouldn’t notice what’s going on; it should be transparent, according to Buchanan. “We saw what happened during Hurricane Sandy, and the data centres in New York – they took a massive hit”, she says. The October 2012 storm damaged a number of data centres and took websites offline.

Backup, backup!

Traditionally, backing up is performed overnight when most users have logged off their organisation’s systems. “Now, in the days where we expect 24×7 usage and the amount of data is ever increasing, the backup window is being squeezed more than ever before, and this has led to solutions being employed that depend on an organisation’s Recovery Point Objective (RPO) and Recovery Time Objective (RTO)”, Buchanan explains.

“For some organisations such as financial services institutions, where these are ideally set at zero, synchronous replication is employed, and this suggests that the data copies are in the same data centre or the data centres are located a few miles or kilometres from each other”, she adds. This is the standard way to minimise data retrieval times, and it is what most people have done in the past because they are trying to support data synchronisation. Yet placing data centres in the same circle of disruption can be disastrous whenever a flood, terrorist attack, power outage or similar event occurs.

For other organisations an RPO and RTO of a few milliseconds is acceptable, and so the data centres can be located further apart. Even then, replication doesn’t negate the need for backing up, and modern technologies allow machines to be backed up whilst they are still operational.
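To illustrate why synchronous replication tends to keep data centres within a few miles or kilometres of one another, the short Python sketch below estimates the round-trip delay that every write must absorb as a function of distance. The propagation speed and distances are illustrative assumptions (signals travel at roughly two-thirds of the speed of light in optical fibre), not figures from Bridgeworks.

```python
# Rough estimate of the write-latency penalty of synchronous replication,
# where every write must be acknowledged by the remote site before the
# application sees it as committed.
# Assumption: signals travel at ~200,000 km/s in optical fibre (~2/3 c).

FIBRE_SPEED_KM_PER_MS = 200.0  # roughly 200 km per millisecond in fibre


def sync_write_penalty_ms(distance_km: float) -> float:
    """Latency added to each write: one round trip to the remote data centre."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_MS


for km in (10, 100, 1000):
    print(f"{km:>5} km apart -> ~{sync_write_penalty_ms(km):.2f} ms added per write")

# Approximate output:
#    10 km apart -> ~0.10 ms added per write
#   100 km apart -> ~1.00 ms added per write
#  1000 km apart -> ~10.00 ms added per write
```

Once a lag of a few milliseconds can be tolerated, the second site can sit hundreds of kilometres away, comfortably outside the circle of disruption.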

Comparing backups

Her colleague and CEO of Bridgeworks, David Trossell, adds that backup-as-a-service (BaaS) can help by reducing infrastructure-related capital investment costs. “It’s simple to deploy and you only pay for what you use; however, the checks and balances with BaaS needn’t be treated any differently from on-site backups”, he explains. In other words, when backup is installed within a data centre, performance is governed by the capability of the devices employed, such as tape or disks. In contrast, performance with BaaS is governed by the connection to the cloud service provider, which Trossell says defines the speed at which data can be transferred to the cloud.

“A good and efficient method of moving data to the cloud is essential, but organisations should keep a backup copy of the data on-site as well as off-site and this principle applies to BaaS”, he advises.

Essentially, this means that a cloud service provider should secure the data in another region in which the CSP operates. In some circumstances it might be cheaper to bring the backup function in-house, while for certain types of sensitive data a hybrid cloud approach might be more suitable.
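As a minimal sketch of that on-site-plus-off-site principle, the Python below archives a dataset into a local vault and then pushes a second copy off-site. The upload_to_cloud function is a hypothetical stand-in for whichever CSP or BaaS client an organisation actually uses; it is not a real API.

```python
import shutil
from pathlib import Path


def upload_to_cloud(archive: Path, region: str) -> None:
    """Hypothetical placeholder for the cloud provider's or BaaS vendor's client call."""
    raise NotImplementedError("wire this up to your provider's SDK or CLI")


def back_up(dataset: Path, local_vault: Path, offsite_region: str) -> None:
    # 1. Create the archive once, written straight into the on-site vault
    #    (fast restores, no dependency on the WAN).
    archive = Path(shutil.make_archive(str(local_vault / dataset.name), "gztar", str(dataset)))

    # 2. Push a second copy off-site, ideally to a different region so that
    #    both copies never sit in the same circle of disruption.
    upload_to_cloud(archive, region=offsite_region)


# Example call (hypothetical paths and region):
# back_up(Path("/srv/orders-db"), Path("/mnt/backup-vault"), offsite_region="eu-north")
```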

Time is the ruler

Trossell says time is the ruler of all things, and he’s right. The challenge, though, is for organisations to achieve more than 95% bandwidth utilisation from their networks, which is difficult because of the way the TCP/IP protocol works. “Customers are using around 15% of their bandwidth, and some people try to run multiple streams, which you have to be able to run down physical connections from the ingress to the egress in order to attain 95% utilisation”, reveals Buchanan.
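The 15% figure is plausible because a single TCP stream’s throughput is roughly capped at its window size divided by the round-trip time, however large the link, which is why multiple parallel streams are needed to fill a long-distance connection. A back-of-the-envelope Python sketch follows, with the window size, latency and link capacity chosen purely as illustrative assumptions:

```python
# Single-stream TCP throughput is roughly capped at window_size / round_trip_time,
# regardless of how much raw capacity the WAN has. All figures are assumptions.

LINK_CAPACITY_GBPS = 10.0           # nominal WAN capacity
TCP_WINDOW_BYTES = 8 * 1024 * 1024  # 8 MiB window
RTT_SECONDS = 0.04                  # 40 ms round trip on a long-distance path


def single_stream_gbps(window_bytes: int, rtt_s: float) -> float:
    return (window_bytes * 8) / rtt_s / 1e9


per_stream = single_stream_gbps(TCP_WINDOW_BYTES, RTT_SECONDS)
streams_needed = int(0.95 * LINK_CAPACITY_GBPS / per_stream) + 1

print(f"one stream:  ~{per_stream:.2f} Gbit/s ({per_stream / LINK_CAPACITY_GBPS:.0%} of the link)")
print(f"streams needed for ~95% utilisation: {streams_needed}")

# Approximate output:
# one stream:  ~1.68 Gbit/s (17% of the link)
# streams needed for ~95% utilisation: 6
```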

For example, one Bridgeworks customer needed to back up 70TB of data over a 10GB WAN, a process that took 42 days to complete. “They were looking to replicate their entire environment, which was going to cost up to £2m, and we put in our boxes within half an hour as a proof of concept”, she explains. Bridgeworks’ team restricted the bandwidth on the WAN to 200MB, which enabled the customer to complete an entire backup within just seven days, achieving “80% expansion headroom on the connection and 75% on the number of days they clawed back”, she says. The customer has since been able to increase their data volumes.
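Taking those figures at face value, a rough calculation shows what the change means in sustained throughput; the decimal-terabyte conversion below is an assumption made for illustration, not a Bridgeworks measurement.

```python
# Sustained throughput needed to move 70 TB in 42 days versus 7 days.
# Assumes decimal units (1 TB = 1e12 bytes).

DATA_BITS = 70e12 * 8      # 70 TB expressed in bits
SECONDS_PER_DAY = 86_400

for days in (42, 7):
    gbps = DATA_BITS / (days * SECONDS_PER_DAY) / 1e9
    print(f"70 TB in {days:>2} days -> ~{gbps:.2f} Gbit/s sustained")

# Approximate output:
# 70 TB in 42 days -> ~0.15 Gbit/s sustained
# 70 TB in  7 days -> ~0.93 Gbit/s sustained
```

Even the seven-day run averages under 1 Gbit/s sustained, which illustrates how much headroom the customer was left with on the connection.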

Providing wider choice

“At the moment, with outdated technology, CEOs and decision-makers haven’t had a choice with regard to the distance between their data centres without having to think about the impact of network latency, but WANrockIT gives the decision-maker the power to make a different choice to the one that has historically been made”, says Trossell. He claims that WANrockIT gives decision-makers freedom, good economics and a high level of frequency, and that it maximises existing infrastructure so that organisations don’t need to throw anything away.

Phil Taylor nevertheless concludes with some sound advice: “People need to be clear about their requirements and governing criteria because at the lowest level all data should be backed up…, and business continuity must consider all operations of a business – not just IT systems”.

To ensure that a disaster recovery plan works, it has to be tested regularly. Time is of the essence, so data backups need to be exercised regularly and with continuous availability, in such a way that maintenance itself doesn’t prove disruptive. Testing will help to iron out any flaws in the process before disaster strikes.