All posts by richardstinton

Disaster recovery: The importance of choosing the right provider for you

I wrote an article recently which centred on Gartner’s prediction that the disaster recovery as a service (DRaaS) market would grow from $2.01bn in 2017 to $3.7bn by 2021.

In my opinion, one of the main drivers for this rapid level of growth is the fact that it is ‘as a service’ and not the complex and expensive ‘create your own’ environment that it used to be. As a result, this has made DRaaS much more accessible to the SMB market, as well as enterprise customers. But, as the list of DRaaS solutions grows along with adoption rates, it's important for customers to carefully consider how their choice of cloud provider should be influenced by their existing infrastructure. This will help to avoid technical challenges down the road.

The concept of disaster recovery

Before I delve into the key considerations for customers when choosing a DR solution, I should, for the sake of the uninitiated amongst us, explain what DR is. It literally means to recover from a disaster, and so encompasses the time and labour required to be up and running again after a data loss or downtime. How well a business recovers depends on the solution that is chosen to protect it against data loss. It is not simply about the time during which systems and employees cannot work. It is also about the amount of data lost when having to fall back on a previous version of that data. Businesses should always ask themselves: “how much would an hour of downtime cost?” And, moreover, “is it possible to remember and reproduce the work that employees or systems did in the last few hours?”
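To make those two questions concrete, here is a minimal sketch of how the cost of a single outage might be estimated. Every figure and field name in it is a hypothetical assumption for illustration only; the downtime figure is what a recovery time objective (RTO) would bound, and the data loss figure is what a recovery point objective (RPO) would bound.

```python
from dataclasses import dataclass


@dataclass
class OutageEstimate:
    downtime_hours: float       # time until systems are usable again (bounded by the RTO)
    data_loss_hours: float      # age of the last recoverable copy (bounded by the RPO)
    revenue_per_hour: float     # hypothetical: income lost per hour of downtime
    staff_count: int            # hypothetical: employees unable to work
    staff_cost_per_hour: float  # hypothetical: loaded hourly cost per employee


def outage_cost(e: OutageEstimate) -> float:
    """Rough cost of one outage: lost revenue, idle staff, and redoing lost work."""
    lost_revenue = e.downtime_hours * e.revenue_per_hour
    idle_cost = e.downtime_hours * e.staff_count * e.staff_cost_per_hour
    # Work done since the last recoverable copy has to be reproduced by the same staff.
    rework_cost = e.data_loss_hours * e.staff_count * e.staff_cost_per_hour
    return lost_revenue + idle_cost + rework_cost


# Two invented scenarios: restoring from last night's backup versus failing over
# to a replica that is only minutes behind production.
nightly_backup = OutageEstimate(8, 24, 5000, 50, 40)
replicated = OutageEstimate(1, 0.25, 5000, 50, 40)

print(outage_cost(nightly_backup))
print(outage_cost(replicated))
```

The model is deliberately crude (it ignores reputational damage and penalties, for example), but it shows why the amount of data lost can matter as much as the downtime itself.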

When choosing a DR solution, what are the considerations?

In the past, customers would usually have resorted to building out a secondary data centre complete with a suitably sized stack of infrastructure to support their key production servers in the event of a DR undertaking. They could either build with new infrastructure, or eke out a few more years from older servers and networking equipment. Often, they would even buy similar storage technology that would support replication.

More recently, software-based replication technologies have enabled a more heterogeneous set-up, but this still requires a significant investment in the secondary data centre. Add the power and cooling required in the secondary DC and the ongoing maintenance of the hardware, and the overall cost and management burden of the DR strategy increases further.

Even recent announcements such as VMware Cloud on AWS are effectively managed co-location offerings, involving a large financial commitment to physical servers and storage which will be running 24/7.

So, should customers be looking to develop their own DR solutions, or would it be easier and more cost-effective to buy a service offering?

Enter DRaaS. Now, customers need only pay for the storage associated with their virtual machines being replicated and protected, and only pay for CPU and RAM when there is a DR test or real failover.
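As a rough illustration of that pricing model, the sketch below bills storage for the whole month and compute only for the hours in which a test or failover is running. The function name, parameters and rates are all hypothetical assumptions; real provider pricing will differ.

```python
def draas_monthly_cost(storage_gb, storage_rate_per_gb, active_hours,
                       vcpus, ram_gb, vcpu_rate_per_hour, ram_gb_rate_per_hour):
    """Storage is billed all month; CPU and RAM only while a test or failover runs."""
    storage_charge = storage_gb * storage_rate_per_gb
    compute_charge = active_hours * (
        vcpus * vcpu_rate_per_hour + ram_gb * ram_gb_rate_per_hour
    )
    return storage_charge + compute_charge


# Invented example: 10 TB of replicated storage, 40 vCPUs and 160 GB of RAM.
quiet_month = draas_monthly_cost(10_000, 0.05, 0, 40, 160, 0.03, 0.01)
test_month = draas_monthly_cost(10_000, 0.05, 8, 40, 160, 0.03, 0.01)

print(round(quiet_month, 2), round(test_month, 2))
```

In a month with no test or failover the compute term is zero, which is the contrast with a secondary data centre whose servers run 24/7 whether or not they are needed.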

Choosing the right DR provider for you

When determining the right DR provider for you, I would always recommend working through a disaster recovery requirements checklist, regardless of whether you are choosing an in-house or DRaaS solution. This checklist should include the following points:

Performance

  • Does the DR solution offer continuous replication?
  • Which RTO and RPO does the solution offer?
  • DRaaS – Does the Cloud Service Provider offer a reliable and fast networking solution, and does the DRaaS solution offer networking efficiencies like compression?

Support of your systems

  • Is the DR solution storage agnostic?
  • How scalable is the solution (up and also down in a DRaaS environment)?
  • DRaaS – Does it offer securely isolated data streams for business critical applications and compliance?

Functionality

  • Is it a complete off-site protection solution, offering both DR and archival (backup) storage?
  • Is it suited for both hardware and logical failures?
  • Does it offer sufficient failover and failback functionality?

Compliance

  • Can it be tested easily and are testing reports available?
  • DRaaS – Are there any licence issues or other investments upfront?
  • DRaaS – Where is the data being kept? Does the service provider comply with EU regulations?

Let’s take VMware customers as an example. What are the benefits for VMware on-premises customers of working with a VMware-based DRaaS service provider?

Clearly, one of the main benefits is that the VMs will not need to be converted to a different hypervisor platform such as Hyper-V, KVM or Xen. This can cause problems as VMware tools will need to be removed (deleting any drivers) and the equivalent tools installed for the new hypervisor. Network Interface Controllers (NICs) will be deleted and new ones will need to be configured. This results in significantly longer on-boarding times as well as ongoing DR management challenges; these factors increase the overall TCO of the DRaaS solution.

In the case of the hyperscale cloud providers, there is also the need to align VM configuration to the nearest instance of CPU, RAM and storage that those providers support. If you have several virtual disks, this may mean that you need more CPU and RAM in order to allow more disks (the number of disks is usually a function of the number of CPU cores). Again, this can significantly drive up the cost of your DRaaS solution.
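The effect is easy to see with a small sketch. The catalogue below is entirely invented (names, sizes, disk limits and prices are assumptions, not any provider's real instance types); the point is only that a disk-count limit tied to instance size can push a VM onto a larger, more expensive instance than its CPU and RAM alone would need.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    vcpus: int
    ram_gb: int
    max_disks: int
    hourly_rate: float


# Hypothetical catalogue, invented for illustration only.
CATALOGUE = [
    InstanceSize("small", 2, 8, 4, 0.10),
    InstanceSize("medium", 4, 16, 8, 0.20),
    InstanceSize("large", 8, 32, 16, 0.40),
]


def smallest_fit(vcpus: int, ram_gb: int, disks: int) -> Optional[InstanceSize]:
    """Return the cheapest catalogue entry that satisfies all three requirements."""
    candidates = [
        size for size in CATALOGUE
        if size.vcpus >= vcpus and size.ram_gb >= ram_gb and size.max_disks >= disks
    ]
    return min(candidates, key=lambda size: size.hourly_rate, default=None)


# The same 2 vCPU / 8 GB VM, first with three virtual disks, then with six.
print(smallest_fit(2, 8, 3))
print(smallest_fit(2, 8, 6))
```

With six disks the VM no longer fits the smallest entry, so in this made-up catalogue it lands on an instance with twice the CPU and RAM at twice the hourly rate.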

In some hyperscale cloud providers, the performance of the virtual disks is limited to a certain number of IOPS. For typical VMware VM implementations, with a C: drive and a data disk or two, this can result in very slow performance.

Over the past few years, iland has developed a highly functional web-based console that gives DRaaS customers the same VMware functionality that they are used to on-premises. This allows them to launch remote consoles, reconfigure VMs, see detailed performance data, take snapshots while running in DR and, importantly, perform test failovers in addition to other functions.

For VMware customers, leveraging a VMware-based cloud provider for Disaster Recovery as a Service delivers rapid on-boarding, cost-effectiveness, ease of ongoing management and a more flexible and reliable solution to protect your business.

Why cloud storage, DRaaS, multi-cloud and data security will all be key cloud drivers in 2018

It's that time of year when industry commentators are weighing in with their predictions and projections for the year ahead.

While cloud computing is a big topic, probably one of the most pressing subjects hitting the headlines in 2018 is increasing regulation, in particular GDPR. However, there are a few other cloud-related topics that I would like to put the spotlight on as we look at the anticipated growth areas for cloud service providers in the year ahead. In particular, I’d like to focus on cloud storage, DRaaS, multi-cloud, and data security.

Growth of cloud storage

Cisco estimates that the total cloud storage market will increase from 370EB in 2017 to 1.1ZB in 2018, which reinforces that this will be a particular growth area for cloud service providers. Increased regulation has driven requirements for several copies of backup data (on-premises, off-site or in the cloud), and in certain industries legislation requires longer-term retention of data, often up to 10 years.

According to a 2017 Gartner survey, 42% of respondents said they would be looking to implement cloud backup in the next year, while 13% said they were already doing so. Increased availability of high-speed fibre broadband, as well as FTTP and MPLS circuits, means backup to the cloud has become much more accessible for small to medium sized businesses.

Over the last year, we have seen massive growth in the take-up of cloud backup offerings. Cloud backup is probably one of the easiest cloud services to test and adopt. For example, it takes only a few clicks within the Veeam Backup & Replication console to add iland as a service provider and start sending backup or copy jobs to the cloud.

Growth of disaster recovery as a service

Statistics from Gartner indicate that the DRaaS market is set to grow from $2.01B in 2017 to $3.7B by 2021. The fact that 2017 saw a great many natural disasters around the world, from hurricanes and floods to wildfires, has accelerated this. As a result, we have seen new customers rushing to buy DRaaS services, and existing customers invoking their DRaaS for real. One organisation in Florida was able to go from having no disaster recovery to having a fully replicated and tested solution within five days as Hurricane Irma swept in.

Aside from natural disasters, the rise of ransomware has been another important driver for DRaaS. The very low RPOs often make DRaaS a better solution than backing up and recovering data on a daily basis. As with cloud backup, the increased availability of high-speed fibre broadband has made DRaaS replication across the internet much more achievable for most customers.

Multi-cloud strategies are taking off

It’s hard to deny the massive shift that is taking place among businesses in favour of multiple cloud environments, including public and private clouds, as well as on-premises infrastructure. As businesses deploy new applications and move critical workloads to save money and boost agility, it's safe to say that the trend of mixing and matching cloud environments will only accelerate.

According to Gartner, the IaaS market grew 31.4% in 2016. While the hyper-scale providers accounted for the lion’s share of this figure, others in the market saw a 13.2% growth. 451 Research predicts that IaaS will continue to grow from an estimated $16B in 2017 to $30B in 2021.

Cloud lock-in is seen as an issue with many hyper-scale cloud service providers. There is concern that many businesses lack contingency plans should they wish to switch from one provider to another. Likewise, they may want to spread the risk and use more than one cloud provider. For example, in heavily regulated industries organisations are strongly advised not to put all of their eggs in one basket.

GDPR compliance and security

For many years security was seen as a hindrance to cloud adoption. Now, in most cases, security is covered by the cloud provider and their vendor partners.

GDPR has created increased requirements for security and compliance around data ownership, access, and deletion, and, importantly, who is responsible for the data.

From the outset, the iland secure cloud has been built to provide all the aspects of security and compliance that an enterprise customer would require. This includes Trend Micro Deep Security to protect the virtual machines running in the customer's virtual data centres, as well as Tenable Nessus to monitor and protect VMs exposed to the internet.

From a compliance perspective, iland has a dedicated team of professionals to ensure that we are at the forefront of compliance initiatives such as ISO 27001, CSA STAR, SOC, HIPAA, PCI and G-Cloud.

GDPR will bring in a whole set of new requirements around data privacy, and iland is constantly improving processes and procedures, as well as offering services to enable customers to understand their commitments around data protection.

As a cloud service provider, we continue to invest in our DRaaS offering to help businesses prepare for natural disasters, ransomware attacks, and other potential threats to data. We have also seen increased usage of our cloud backup offering, based on Veeam Cloud Connect. Understanding that a multi-cloud solution is something that businesses will increasingly seek, we aim to help our customers diversify their cloud strategy in the year ahead.

Are availability zones a disaster recovery solution?

I recently read an article which began “you can’t predict a disaster, but you can be prepared for one.” It got me thinking. I can hardly remember a time when disaster recovery was a bigger challenge for infrastructure managers than it is today. In fact, with ever-increasing threats to IT systems, a reliable disaster recovery strategy is now absolutely essential for an organisation, regardless of its vertical market.

What does all this have to do with availability zones, I hear you cry? Furthermore, what is an availability zone and is it a good disaster recovery strategy? The purpose of availability zones is to provide better availability while protecting against failure of the underlying platform (the hypervisor, physical server, network, and storage). They give customers more options in the event of a localised data centre fault. Availability zones can also allow customers to use cloud services in two regions simultaneously if these regions are in the same geographic area.

Let us begin our discussion about availability zones by looking at the core capabilities that provide availability and resilience. Dynamic Resource Schedulers (DRS) provide Virtual Machine (VM) placement. That is, which host should run a given VM? A DRS also moves VMs around a cluster based on usage in order to balance out the cluster. High Availability (HA) provides the capability to restart VMs on other hosts in a cluster when either a host fails, or a VM crashes for any reason.
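As a rough illustration of these two capabilities, the sketch below models initial placement (choose the host with the most free memory) and an HA-style restart after a host failure. It is a toy model built on my own assumptions, not how any particular hypervisor implements DRS or HA: the host names are invented, only RAM is considered, and live rebalancing of running VMs is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    ram_gb: int
    vms: dict = field(default_factory=dict)  # VM name -> RAM in GB
    alive: bool = True

    def free_ram(self) -> int:
        return self.ram_gb - sum(self.vms.values())


def place(hosts, vm_name, vm_ram_gb):
    """Initial placement: put the VM on the live host with the most free RAM."""
    candidates = [h for h in hosts if h.alive and h.free_ram() >= vm_ram_gb]
    if not candidates:
        raise RuntimeError(f"no capacity for {vm_name}")
    target = max(candidates, key=Host.free_ram)
    target.vms[vm_name] = vm_ram_gb
    return target.name


def restart_after_failure(hosts, failed_name):
    """HA: restart every VM from the failed host elsewhere.

    The VMs are restarted, not moved live, so each one suffers a brief outage.
    """
    failed = next(h for h in hosts if h.name == failed_name)
    failed.alive = False
    orphans, failed.vms = failed.vms, {}
    return {vm: place(hosts, vm, ram) for vm, ram in orphans.items()}


cluster = [Host("host-a", 64), Host("host-b", 64)]
place(cluster, "web", 16)
place(cluster, "db", 32)
print(restart_after_failure(cluster, "host-a"))
```

Note that the restart relies on every host being able to reach the same shared storage, which the model simply takes for granted.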

Now, let us look at the advantages that availability zones offer, as well as areas where they may fall short of constituting an effective disaster recovery strategy. This analysis of availability zone effectiveness will be divided based on three key challenges that cloud providers face: handling crashes or downtime, performing maintenance, and offering sufficient storage.

Crashes or downtime

It is not unusual for a cloud provider to only offer HA and not DRS. In this case, in the event of a host hypervisor crash or deliberate shutdown, VMs are restarted on other hosts because they have shared storage. This is done using initial placement calculation. However, providers often do not have the ability to move a running VM between hosts in a cluster with no loss of service, and to incorporate such a DRS capability would strengthen disaster recovery preparedness.

Maintenance 

There is also a problem with this model around planned maintenance. When hosts are updated, it is not possible to move the VMs that are running on them without loss of service. Therefore, VMs occasionally have the rug pulled out from underneath them.

With this in mind, many service providers talk about a ‘Design for Failure’ model when designing resilient services. In a nutshell, this means designing cloud infrastructure on the premise that parts of it will inevitably fail. Resiliency is provided at the application level. At the very least, this requires the doubling up of all applications, and for many deployments this necessitates additional licensing and additional costs for the VMs themselves.

Storage

Another crucial area to factor into this analysis is persistent storage. In the past, storage was protected using RAID techniques. Yet as we move to the public cloud, object storage has appeared as a popular way of storing data. This method uses the availability zone topology to protect data — but only if you choose it and pay for it. To protect against individual disk failure, three copies of the data are spread across the storage subsystems.

For virtual machines requiring persistent storage, block storage such as Elastic Block Store (EBS) is often used, and is replicated within the availability zone to protect against failure of the underlying storage platform.

EBS storage is not always replicated to other regions. 

Regardless, having data replicated to another region does not mean that the VMs are available there. It only guarantees backup storage. VMs would need to be created from the underlying replicated storage. It is also important to note that replicating storage to another availability zone or region only protects against storage subsystem failure. It does not protect against storage corruption, accidental deletion, or recent threats such as ransomware encrypting the files within the storage. To that extent, it does not constitute a disaster recovery solution.

So, we return to our original question: can availability zones theoretically offer the resiliency needed for a good disaster recovery strategy? In the event of a crash, Dynamic Resource Schedulers can be used to move a VM between hosts in a cluster with no loss of service. However, when hosts are being updated, it is very difficult to move the VMs that are running on them without loss of service. As we have just discussed, redundant storage does not guarantee VM availability in other regions. Most importantly, these capabilities do not protect against data corruption or threats such as ransomware that encrypt data. Given this, a disaster recovery solution should be implemented in addition to the use of availability zones.

Cloud-to-cloud disaster recovery as a service (DRaaS) can be adopted between data centres. With the iland DRaaS solution, VMs can be rebooted within seconds in the event of a crash or downtime. iland DRaaS also offers a continuous replication solution with a journal supporting up to 30 days. This means that you can recover data if it is lost or corrupted; for example, you can recover data from a ransomware attack. Self-service testing can also be carried out whenever required, while replication carries on in the background. As customers think about migrating their traditional virtualised services to the public cloud, they need to consider crashes, maintenance, storage, and also a disaster recovery strategy.
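To show why a journal helps after a ransomware attack, here is a minimal sketch of picking a recovery point: the newest checkpoint taken before the data was corrupted, provided it still falls inside the retention window. This is a generic illustration under my own assumptions, not the implementation used by iland or by any replication product; the function name and the five-minute checkpoint interval are invented.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

JOURNAL_RETENTION = timedelta(days=30)  # the 30-day journal mentioned above


def recovery_point(checkpoints, corrupted_at, now):
    """Return the newest checkpoint strictly before the corruption, or None.

    `checkpoints` must be sorted oldest first.
    """
    oldest_kept = now - JOURNAL_RETENTION
    idx = bisect_left(checkpoints, corrupted_at)  # first checkpoint >= corrupted_at
    if idx == 0:
        return None  # nothing was recorded before the corruption
    candidate = checkpoints[idx - 1]
    return candidate if candidate >= oldest_kept else None


now = datetime(2018, 1, 31, 12, 0)
# Hypothetical journal with a checkpoint every five minutes from 09:00 to 09:55.
journal = [datetime(2018, 1, 31, 9, 0) + timedelta(minutes=5 * i) for i in range(12)]
print(recovery_point(journal, datetime(2018, 1, 31, 9, 32), now))
```

A daily backup would offer only one candidate per day, so the same attack could cost up to a day of work rather than a few minutes.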

How virtualisation is a vital stepping stone to the cloud

There has been a lot of talk about virtualisation over the years, and you would be forgiven for thinking that every server and every storage system has had some kind of virtualisation treatment. Yet there are still some companies out there who have yet to virtualise and realise the benefits that both virtualisation and cloud offer.

So, for those who have not gone down the virtualisation route, let me summarise the benefits from a business and technical perspective. While we are there, let’s also take a look at virtualisation in the context of cloud adoption and the benefits it brings in a cloud environment. The need to embrace virtualisation normally comes hand in hand with growing hybrid environments, where some of your IT infrastructure is in the cloud and some is on-premises. In fact, virtualisation is often seen as a stepping stone into cloud.

Why virtualise?

Let’s start by looking at the benefits that virtualisation brings to the business. Virtualisation can increase IT agility, flexibility, and scalability while creating significant cost savings. Workloads get deployed faster, performance and availability increase, and operations become automated. This results in an IT environment that is simpler to manage and less costly to own and operate. Other key business benefits include the ability to reduce capital and operating costs and to minimise or even eliminate downtime. IT can provision applications and resources faster and enable business continuity and disaster recovery. Finally, virtualisation helps to simplify data centre management.

The key technical benefits of virtualisation are many and include:

  • Encapsulation – the virtual machine is described by a small number of files, typically its virtual disk drive and a resource description file containing its virtual hardware requirements
  • Hardware independence – virtual machines separate the operating system from the physical hardware, so they are no longer tightly locked to particular hardware through device drivers. This makes moving a VM from a Dell server, say, to an HP or Lenovo server much more straightforward
  • Live migration – a virtual machine can be moved from one physical host to another with no downtime for the operating system. This is very useful for load balancing, and also for maintenance or upgrading of hosts

Best practice tools and approaches

Tools are available to virtualise current physical servers, such as VMware Converter and PlateSpin PowerConvert. However, it is often better to build a new virtual machine and reload the application and data. As organisations choose to upgrade their operating systems and applications, they are normally doing this from a fresh build of Windows or Linux as a virtual machine. In terms of best practices for virtualising servers, all Intel x86 servers are now candidates for virtualisation.

The main challenges around virtualisation are normally about the risk associated with the migration process, especially downtime, and around the licensing policies of the software being run. For this reason, the main servers still running on physical hardware tend to be large transactional databases such as SQL Server clusters and Oracle.

To get the greatest benefits out of virtualisation, it should be built on a shared storage infrastructure. In the early days, this meant fibre-channel SAN or network-attached (NFS) storage. Recently, hyper-converged infrastructures have appeared on the scene from companies such as VMware (vSAN), Nutanix and SimpliVity. These take industry-standard x86 servers with large amounts of CPU and RAM, but use internal disks rather than a SAN, together with high-speed networking. Technologies such as flash and solid-state disks, as well as compression and deduplication, have made these storage systems cheaper and faster for virtualised workloads. Software ensures that all the data is fully redundant across the cluster, with virtual machines load balanced at the same time.

Virtualising servers and storage in preparation for creating a private cloud

Virtualisation is a key requirement for the cloud, whether migrating or creating from new. Once virtualised, virtual machines can be migrated to cloud providers via export/import mechanisms such as the Open Virtualisation Format (OVF).  In some cases the format of the virtual machine will need to be changed depending on the source and target format.  For example, VMware to MS Azure will require conversion to Hyper-V, while Amazon AWS would require conversion to Xen. 

Rather than migrating and converting, it is often better to replicate the data over a period of time using solutions such as Zerto, Veeam or DoubleTake.  This will result in a much shorter downtime required to switch over from running on-premises to a cloud provider. As an established cloud service provider, iland helps customers to migrate their physical and virtual servers to the public or private cloud.

If this is all starting to sound too good to be true, let me reassure you: there are very few pitfalls. Sure, in the early days there was a slight performance overhead associated with virtualisation when compared to the raw physical server, but these days server and storage technology have all but removed that.

Future developments

Virtualisation technology has matured over the past few years. The new developments are really around automation and the final virtualisation of networking, which together allow for what is often termed the Software Defined Data Centre. In this world, everything from servers and storage to networking can be defined and controlled through software. The utopia is where virtual machines can be moved and managed using common networks wherever they happen to be.

Virtualisation offers a host of benefits and very few challenges as the market matures. That said, without right-sizing VMs and tracking and anticipating resource needs, businesses could face issues such as VM sprawl and over-consumption, so it does need to be tightly managed. My advice is that businesses should plan for the long term to make sure that they have enough resources on hand to meet future business demands.
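As a simple illustration of right-sizing, the sketch below flags VMs whose allocated CPU or RAM is well above their observed peak usage and suggests a smaller allocation. The 25% headroom, the data layout and the VM names are my own assumptions for the example; in practice the peaks would come from whatever monitoring you already have in place.

```python
import math


def rightsizing_report(vms, headroom=1.25):
    """Suggest a smaller allocation for VMs whose peak use leaves excess capacity.

    `vms` maps a VM name to a tuple of
    (allocated_vcpus, peak_vcpus_used, allocated_ram_gb, peak_ram_gb_used).
    The result maps each flagged VM to its suggested (vcpus, ram_gb).
    """
    report = {}
    for name, (vcpus, peak_vcpus, ram_gb, peak_ram_gb) in vms.items():
        # Keep some headroom above the observed peak, and never go below 1.
        want_vcpus = max(1, math.ceil(peak_vcpus * headroom))
        want_ram_gb = max(1, math.ceil(peak_ram_gb * headroom))
        if want_vcpus < vcpus or want_ram_gb < ram_gb:
            report[name] = (min(vcpus, want_vcpus), min(ram_gb, want_ram_gb))
    return report


# Invented sample data: "app" is heavily over-allocated, "db" is not.
observed = {
    "app": (8, 2.0, 32, 10.0),
    "db": (4, 3.6, 16, 14.0),
}
print(rightsizing_report(observed))
```

Run regularly, a report like this helps to keep sprawl in check and gives an early indication of when more capacity will genuinely be needed.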