All posts by chrisducker

A disaster recovery plan: What is your IT team keeping from you?

Your disaster recovery program is like a parachute – you don’t want to find yourself in freefall before you discover it won’t open. But amid hastening development cycles, and cost, resource and time pressures, many CIOs are failing to adequately prioritise DR planning and testing.

While IT teams are running to stand still with day-to-day responsibilities, DR efforts tend to be focused solely on infrastructure, hardware and software, neglecting the people and processes needed to execute the plan. At best, this runs the risk of failed recovery testing. At worst, a business may be brought to its knees at a time of actual disaster without any chance of a swift recovery.

Your team may be reluctant to flag areas of concern, or admit that they aren’t confident your DR plan will work in practice. Perhaps they’re relying on the belief that “disaster” is a statistically unlikely freak of nature (we all know hurricanes hardly ever happen in Hertford, Hereford and Hampshire) rather than a mundane but far more probable hardware failure or human error. At least one of the following admissions may be going unspoken in your own organisation:

“We’re not confident of meeting our RTOs/RPOs”

Even if you passed your last annual DR test, it’s only a predictor of recovery, not a guarantee. Most testing takes place under managed conditions and takes months to plan, whereas in real life, outages strike without notice. Mission-critical applications have multiple dependencies that change frequently, so without ongoing tests, a recovery plan that worked only a few months ago might now fail to restore availability to a critical business application.
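One way to make this visible between annual tests is to compare every test run against the agreed objectives. The following is a minimal sketch only: the class names, the function and the figures are hypothetical and are not taken from any particular DR tool.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class RecoveryTarget:
    rto: timedelta  # recovery time objective: maximum tolerable downtime
    rpo: timedelta  # recovery point objective: maximum tolerable data-loss window


@dataclass
class RecoveryRun:
    time_to_restore: timedelta   # how long the test restore actually took
    data_loss_window: timedelta  # age of the newest data that was recoverable


def missed_objectives(target: RecoveryTarget, run: RecoveryRun) -> List[str]:
    """Return the names of the objectives that this test run failed to meet."""
    missed = []
    if run.time_to_restore > target.rto:
        missed.append("RTO")
    if run.data_loss_window > target.rpo:
        missed.append("RPO")
    return missed


# Hypothetical figures for a single mission-critical application.
target = RecoveryTarget(rto=timedelta(hours=4), rpo=timedelta(minutes=15))
latest_run = RecoveryRun(
    time_to_restore=timedelta(hours=5),
    data_loss_window=timedelta(minutes=10),
)
print(missed_objectives(target, latest_run))
```

Running a check like this after every test, rather than once a year, shows when a change in dependencies has quietly pushed an application past its objectives.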

“Our DR plan only scratches the surface”

Many organisations overlook the impact of disruption on staff and the long-term availability of their data centres. How long you can support an outage at your recovery centre – whether that’s days or weeks – will determine your DR approach. Can you anticipate what you would do in a major disaster if you lost power, buildings or communication links? What if you can’t get the right people to the right places? How well is everyone informed of procedures and chains of command? People and processes are as relevant as technology when it comes to rigorous DR planning.

“We know how to fail over… just not how to fail back”

Failback – reinstating your production environment – can be the most disruptive element of a DR execution, because most processes have to be performed in reverse. Yet organisations often omit the process of testing their capabilities to recover back to the primary environment. When push comes to shove, failure to document and test this component of the DR plan could force a business to rely on its secondary site for longer than anticipated, adding significant costs and putting a strain on staff.

“Our runbooks are a little dusty”

How often do you evaluate and update your runbooks? Almost certainly not frequently enough. They should contain all the information your team needs to perform day-to-day operations and respond to emergency situations, including resource information about your primary data centre and its hardware and software, and step-by-step recovery procedures for operational processes. If this “bible” isn’t kept up to date and thoroughly scrutinised by key stakeholders, your recovery process is likely to stall, if not grind to a halt.

“Change management hasn’t changed”

Change is a constant of today’s highly dynamic production environments, in which applications can be deployed, storage provisioned and new systems set up with unprecedented speed. But the ease and frequency with which these changes are introduced means they’re not always reflected in your recovery site. The deciding factor in a successful recovery is whether you’ve stayed on top of formal day-to-day change management so that your secondary environment is in perfect sync with your live production environment.
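As an illustration, drift between the two sites can be surfaced by comparing an inventory of each. This sketch assumes you can export a simple mapping of component name to version from both environments; the function name and the sample inventories are hypothetical.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    production: Dict[str, str], recovery: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each out-of-sync component to (production version, recovery version).

    A version of None means the component is missing from that site.
    """
    drift = {}
    for name in sorted(production.keys() | recovery.keys()):
        prod_version = production.get(name)
        dr_version = recovery.get(name)
        if prod_version != dr_version:
            drift[name] = (prod_version, dr_version)
    return drift


# Hypothetical inventories exported from each environment.
production = {"billing-app": "2.4.1", "orders-db": "14.2", "reporting": "1.0.0"}
recovery = {"billing-app": "2.3.9", "orders-db": "14.2"}

for name, (prod_version, dr_version) in find_drift(production, recovery).items():
    print(f"{name}: production={prod_version} recovery={dr_version}")
```

A report like this is only as good as the inventories behind it, so it complements formal change management rather than replacing it.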

“Our backup is one size fits all”

In today’s increasingly complex IT environments, not all applications and data are created equal. Many organisations default to backing up all their systems and both transactional and supportive records en masse, using the same method and frequency. Instead, applications and data should be prioritised according to business value: this allows each tier to be backed up on a different schedule to maximise efficiency and, during recovery, ensures that the most critical applications are restored soonest.
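A tiered approach can be sketched in a few lines. The tier numbers, backup intervals and application names below are assumptions chosen for illustration; your own tiers should come from a business impact assessment.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical policy: tier 1 is the most business-critical.
BACKUP_INTERVAL_HOURS = {1: 1, 2: 24, 3: 168}


@dataclass
class Application:
    name: str
    tier: int


def backup_interval_hours(app: Application) -> int:
    """Look up how often an application is backed up, based on its tier."""
    return BACKUP_INTERVAL_HOURS[app.tier]


def restore_order(apps: List[Application]) -> List[str]:
    """List application names in restore order: lowest tier number first."""
    return [app.name for app in sorted(apps, key=lambda a: (a.tier, a.name))]


apps = [
    Application("intranet-wiki", tier=3),
    Application("payments", tier=1),
    Application("crm", tier=2),
    Application("order-entry", tier=1),
]
print(restore_order(apps))
print({app.name: backup_interval_hours(app) for app in apps})
```

The same tier value drives both decisions, so the applications that are backed up most often are also the ones restored first.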

“Backing up isn’t moving us forward”

Backups are not, in isolation, a complete DR solution, but data management is a critical element of a successful recovery management plan. Whether you’re replicating to disk, tape or a blend of both, shuttling data between storage media is achingly slow. And if it takes forever to move and restore data, then regular testing becomes even less appealing. But forgoing a regular test restoration process simply because of time-to-restore concerns is a recipe for data loss in the event of an outage.

“We don’t have the bandwidth for testing”

Testing the recovery procedures of individual applications is a very different exercise from recreating a data centre from scratch. Trying to squeeze the whole exercise into a 72-hour testing window won’t do: that’s just enough time to marshal the right employees and ask them to take part in a test that isn’t part of their core function. So companies often end up winging it with whatever resources they have on hand, rather than mapping out the people they need to conduct and validate a truly indicative test.

“We don’t want to do it…but we’re not keen on someone else doing it”

Trying to persuade employees that an outsource option for recovery is in their best interests can be like selling Christmas to turkeys. 

But in fact, partnering with a recovery service provider actively complements in-house skill sets by allowing your people to focus on projects that move your business forward rather than on operational tasks. It can also boost overall recoverability. Managed recovery doesn’t have to be an all-or-nothing proposition, either; it can be a considered and congruous division of responsibilities.

With always-on availability becoming a competitive differentiator, as well as an operational must-have, you don’t have the luxury of trusting to luck that your DR plans will truly hold up in the event of a disaster.

The first step to recovery starts with admitting you have a problem and asking for help.

Are the costs of cloud implementations overruling the benefits?

Cloud computing has unquestionably had a dynamic effect on business growth and productivity, enabling mobile collaboration and scalable packages that allow for rapid expansion. One of the principal benefits given in favour of this model of computing is its ability to drive down costs, particularly when it comes to removing or reducing the need for upfront investment, and providing a stable, predictable IT subscription.

One might assume, then, that the global take-up of cloud solutions would drive costs down further even as efficiencies grow. Yet often the opposite is true, and many businesses are encountering ‘hidden’ costs they hadn’t anticipated, causing some to question the overall value of cloud migration.

Unexpected costs

While cloud service providers naturally advertise the potential IT cost-savings to be made, several factors can erode those savings and need to be taken into account.

The first is a question of human resources. In-house IT staff will need ongoing training in order to manage the cloud architecture efficiently, from the first stages of migration to integrating systems and maintaining them. This cost rises as hybrid systems become more complex, because your staff need to collaborate extensively with the service provider’s administrators.

With regard to data retrieval, some cloud service providers offer tiered storage options, with differing levels of accessibility, which allow businesses to compartmentalise their data according to how often they need to retrieve it. The less accessible tiers are typically cheaper to store data in but more expensive to retrieve it from. This means that if you unexpectedly need prompt access to critical data held in one of those tiers, you could end up paying more than you had initially budgeted for storage.
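A rough calculation shows how quickly a retrieval charge can overtake a storage budget. The prices and volumes here are invented for illustration and do not reflect any real provider’s tariff; the function name is also hypothetical.

```python
from decimal import Decimal


def monthly_cost(
    stored_gb: int,
    retrieved_gb: int,
    storage_price_per_gb: Decimal,
    retrieval_price_per_gb: Decimal,
) -> Decimal:
    """Total monthly bill: the storage charge plus any retrieval charge."""
    return stored_gb * storage_price_per_gb + retrieved_gb * retrieval_price_per_gb


# Hypothetical archive-tier prices per GB (not a real provider's tariff).
STORAGE_PRICE = Decimal("0.004")
RETRIEVAL_PRICE = Decimal("0.03")

# Budgeted month: 10,000 GB stored and nothing retrieved.
budgeted = monthly_cost(10_000, 0, STORAGE_PRICE, RETRIEVAL_PRICE)

# Actual month: half of the archive is needed back in a hurry.
actual = monthly_cost(10_000, 5_000, STORAGE_PRICE, RETRIEVAL_PRICE)

print(f"budgeted: {budgeted:.2f}  actual: {actual:.2f}")
```

With these assumed figures, a single urgent retrieval costs several times the month’s storage charge, which is why retrieval patterns belong in the budget alongside capacity.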

It can also be easy to over- or under-provision, which increases costs until you find the configuration that works best for you. Pricing models are also known to change regularly, in part because service providers find that increasing demand, or the growing complexity of their offering, is raising their own costs.

While by no means comprehensive, this list should throw some light on what is a pressing concern for businesses investing in cloud migration. Part of the issue is that many businesses still view cloud computing solely as a cost, rather than as a benefit. Even though they may be paying more than they anticipated, the importance of reliable service, increased productivity and IT agility that they gain in return cannot be overstated.

Operational costs will still tend to be substantially lower than hosting software in-house, and there are numerous advantages, just one of which is an efficiency gain in the instant release of upgraded software and application of patches.

Cost reduction strategies

The good news is there are various strategies available to lower unexpected or ‘hidden’ costs of cloud computing.

Planning is key. The more time you spend before migration determining exactly what your company’s requirements and internal resources are, the more successful that migration is likely to be in its early stages. When looking at budget, it’s advisable to factor in the cost of making a few mistakes at the beginning, such as misjudging storage needs. Never rush the decision when selecting a provider: if you make the wrong choice, moving your data elsewhere can be time-consuming and costly.

Staggering the migration process is also a good idea: consider moving easier workloads across first to build up knowledge, experience and confidence among your IT team. Equally important is investing in your in-house staff so that they have the skills to manage the new environment. You should expect some of this cost to be permanent; while cloud computing is not inherently expensive, the ‘human cost’ of ongoing training, support and maintenance can be.

It is also in the best interests of your service provider to ensure their clients are happy, so if you have issues with costs, you can take them up with your account manager and try to work out a more suitable plan going forwards.

Conclusion

There is a reason why most opinion on cloud computing remains positive: it usually does make financial sense for businesses of all sizes. To realise that benefit, it’s vital to have a good understanding of what you’re paying for and why.