“I know what some of you are thinking: how could this happen if we have multiple copies of your data in multiple data centres?” wrote Ben Treynor, Google’s VP of engineering. It was 2011, and Treynor was using a Google blog to address an embarrassing glitch. In the process of updating its storage software, the company had wiped 0.02% of its users’ email. That doesn’t sound like much – until you consider how many inboxes Gmail hosts.
The solution was more complex than merely switching to another data centre. As Treynor explained, “to protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs, but restoring data from them takes longer than transferring your requests to another data centre, which is why it’s taken us hours to get the email back instead of milliseconds.” All told, the restore operation took the best part of four days.
Seven years later, tape remains a mainstay of even the biggest cloud providers’ contingency plans. Should it also be part of yours?
Tape vs cloud
“I’m not saying that tape is always the right place to store all data all the time, [but] it’s the right place to store large amounts of data that you don’t need all the time for long periods of time,” explained Overland-Tandberg’s tape product line director, Peri Grover. “Mission-critical transactional stuff that you need back in a matter of seconds – of course you want that on disk or some other solution, and a lot of people are talking about the cloud.”
She readily agrees that there are advantages to cloud storage, but points to the impracticality of moving large amounts of data between sites. “Amazon Web Services may look inexpensive on the surface,” said Grover, “but when people start checking into it and find out that if they’ve got a 10TB file they need a 40Gbit [connection] to get it across in seconds, [they realise] that’s going to cost them hundreds of thousands of dollars.”
Eric Bassier, senior director for product management at Quantum, also cites cloud’s relatively low speed as a reason for sticking with tape. With a service such as Amazon Glacier, he said, “its published SLA is hours, and that doesn’t count the time to transmit the data from the data centre to the local application. Your SLA from tape is minutes, versus days from the cloud. If you have tens or hundreds of terabytes of data to retrieve in a true disaster recovery scenario, it can be weeks.”
This is enough to make one of cloud’s key benefits – global availability, 24/7 – a moot point. “With tape, you have got an impressive transfer rate once you’ve got to the first byte of data,” said Carlos Sandoval Castro, worldwide offering manager for tape storage at IBM. “For many companies, it’s much easier to just ship a bunch of cartridges to whatever location so they can get their data back, rather than wait to retrieve it from the cloud.”
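The bandwidth arithmetic behind these remarks can be sketched in a few lines of Python. This is a back-of-the-envelope estimate only: it assumes decimal units (1TB = 10^12 bytes), a link running at its full rated speed with no protocol overhead unless an efficiency factor is supplied, and it ignores any retrieval delay at the provider's end. The function name, the efficiency parameter and the 1Gbit example link are illustrative assumptions, not figures from the interviewees.

```python
def transfer_seconds(size_tb: float, link_gbit_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Idealised time, in seconds, to move size_tb terabytes over a link.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead and contention; 1.0 means the full rated speed.
    """
    if size_tb < 0:
        raise ValueError("size_tb must not be negative")
    if link_gbit_per_s <= 0:
        raise ValueError("link_gbit_per_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    size_bits = size_tb * 1e12 * 8            # decimal terabytes to bits
    usable_bits_per_s = link_gbit_per_s * 1e9 * efficiency
    return size_bits / usable_bits_per_s


if __name__ == "__main__":
    # The 10TB file over a 40Gbit link mentioned by Grover.
    minutes = transfer_seconds(10, 40) / 60
    print(f"10TB over 40Gbit/s: about {minutes:.0f} minutes")

    # An assumed 100TB recovery over an assumed 1Gbit link.
    days = transfer_seconds(100, 1) / 86_400
    print(f"100TB over 1Gbit/s: about {days:.1f} days")
```

Under these assumptions, 10TB over a fully utilised 40Gbit link takes about 33 minutes, and 100TB over a 1Gbit link takes a little over nine days. This is consistent with Bassier's point that large restores over a network can stretch well beyond the provider's stated retrieval time.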
Cost analysis
It’s a common assumption that cloud is cheap, with providers routinely demanding mere cents for each gigabyte. Surely tape can’t compete.
Actually, it can, said Bassier. “We’re seeing companies starting to bring datasets and workloads back from the cloud because they’re realising it’s too expensive. When they look at tape or cloud as alternatives for long-term storage, a key factor is the cost of retrieval, which is free using on-premise tape, but can be expensive from the cloud.”
The same is true of disk. “Clifford Group compared the cost of tape-only and disk-only archiving solutions,” Grover explained. “When it took into account cooling and footprint, the cost of the hardware, maintenance, media… the disk solution was six times more expensive than tape, at $15 million versus just $2.4 million.”
Storing your storage media
While disk is starting to hit its limits where areal density is concerned, tape still has some way to go – and a multi-year roadmap to prove it. New standards, which are currently under development, are touching capacities of 185TB per cartridge. They’re not yet commercially available, but they have existed in the lab since 2014.
Increasing tape’s areal density has other benefits. It’s not only the price per gigabyte that’s falling as more data is written to each cartridge; cost of storing the media is falling in sync.
“An LTO 1 cartridge held 200GB compressed,” said Laura Loredo, senior product manager for Hewlett Packard Enterprise, “but today’s cartridges are 30TB compressed, so people migrate for the benefits of consolidating their data.” She cites freeing up slots in their libraries as one such benefit.
This makes the cartridges’ stated working life – three decades apiece – somewhat moot but, as Grover explained, that metric is more a means of quantifying their durability than an indication of how they’re used in the real world.
“Tape technology originally developed from consumer cassettes, which had a reputation for not being very robust,” Grover said. “People would leave them on the dash of their car and they’d melt. So, that’s where the 30-year claim came from. The industry had a longevity robustness perception problem to overcome and once LTO came along, all that baggage associated with tape was logically gone but not emotionally gone in some of the old-timer IT guys’ mindsets.”
Security built-in
The removable nature of tape means it’s uniquely protected against security threats that afflict otherwise live storage media. As Grover explains, “it isn’t subject to things like malware, viruses or ransomware the way disk is, simply by nature of the media’s physicality”.
Hewlett Packard’s Loredo agrees:
“Tape provides an air gap because it’s offline. If your primary disks get attacked with ransomware, you can go to your tapes and recover your data from there without paying the ransom. It’s the only media that provides proper protection against malware and ransomware.”
But not paying the ransom isn’t the only saving you’d make in such a situation – or in a natural or man-made disaster scenario.
“Downtime can cost as much as $3 million an hour if you’re a credit card company,” said Grover. “You can’t risk that. All you have to do is talk to an IT guy about being down or not being able to get your data back – that’s a hard thing to put a number around. As an IT guy, you don’t lose your job for not backing up the data; you lose it for not being able to get it back.”
At this point, the discussion closes a loop. Keeping the data close to home makes it immediately accessible, which reduces the duration – and thus cost – of any downtime while minimising ongoing overheads.
However, there are other reasons why you might favour tape over online storage.
“Once you start putting your data into a provider’s service, you’re locked in,” said Loredo. “You’ll have to get it all out again before you can move [to another provider]. And when you send your data to a third party you’ve lost control of it. Do you know how secure it is there? With GDPR, you should be concerned about that.”
Grover has similar sentiments. “[Data is] your company’s most valuable asset, and you want to trust that to somebody else? If you want to do it in parallel, that’s great, but they have no interest in your company, don’t reside on any of your campuses… no matter how much you pay them, they’re the ones who have control over when you get your data back and how it gets stored. Are you really good with that?”