Catching up with Chuck Hollis: A Storage Discussion

Things are moving fast in the IT world. Recently, we caught up with Chuck Hollis (EMC’s Global Marketing CTO and popular industry blogger) to discuss a variety of topics including datacenter federation, Solid State Drives, and misperceptions surrounding cloud storage.

JTC: Let’s start off with Datacenter federation…what is coming down the road for running active/active datacenters with both HA and DR?

Chuck: I suppose the first thing that’s worth pointing out is that we’re starting to see using multiple data centers as an opportunity, as opposed to some sort of problem to overcome. Five years ago, it seems that everyone wanted to collapse into one or two data centers. Now, it’s pretty clear that the pendulum is starting to move in the other direction – using a number of smaller locations that are geographically dispersed.

The motivations are pretty clear as well: separation gives you additional protection, for certain applications users get better experiences when they’re close to their data, and so on. And, of course, there are so many options these days for hosting, managed private cloud services and the like. No need to own all your data centers anymore!

As a result, we want to think of our “pool of resources” as not just the stuff sitting in a single data center, but the stuff in all of our locations. We want to load balance, we want to failover, we want to recover from a disaster and so on – and not require separate technology stacks.

We’re now at a point where the technologies are coming together nicely to do just that. In the EMC world, that would be products like VPLEX and RecoverPoint, tightly integrated with VMware from an operations perspective. I’m impressed that we have a non-trivial number of customers that are routinely doing live migrations at metro distances using VPLEX or testing their failover capabilities (non-disruptively and at a distance) using RecoverPoint.

The costs are coming down, the simplicity and integration are moving up – meaning that these environments are far easier to justify, deploy and manage than just a few years ago. Before long, I think we’ll see active-active data centers as sort of an expected norm vs. an exception.

JTC: How is SSD being leveraged in total data solutions now, with the rollout of the various XtremIO products?

Chuck: Well, I think most people realize we’re in the midst of a rather substantial storage technology shift. Flash (in all its forms) is now preferred for performance, disks for capacity.

The first wave of flash adoption was combining flash and disk inside the array (using intelligent software), usually dubbed a “hybrid array”. These have proven to be very, very popular: with the right software, a little bit of flash in your array can result in an eye-popping performance boost and be far more cost effective than trying to use only physical disks to do so. In the EMC portfolio, this would be FAST on either a VNX or VMAX. The approach has proven so popular that most modern storage arrays have at least some sort of ability to mix flash and disk.

The second wave is upon us now: putting flash cards directly into the server to deliver even more cost-effective performance. With this approach, storage is accessed at bus speed, not network speed – so once again you get an incredible boost in performance, even as compared to the hybrid arrays. Keep in mind, though: today this server-based flash storage is primarily used as a cache, and not as persistent and resilient storage – there’s still a need for external arrays in most situations. In the EMC portfolio, that would be the XtremSF hardware and XtremSW software – again, very popular with the performance-focused crowd.

The third wave will get underway later this year: all-flash array designs that leave behind the need to support spinning disks. Without dragging you through the details, if you design an array to support flash and only flash, you can do some pretty impactful things in terms of performance, functionality, cost-effectiveness and the like. I think the most exciting example right now is the XtremIO array which we’ve started to deliver to customers. Performance-wise, it spans the gap between hybrid arrays and server flash, delivering predictable performance largely regardless of how you’re accessing the data. You can turn on all the bells and whistles (snaps, etc.) and run them at full-bore. And data deduplication is assumed to be on all the time, making the economics a lot more approachable.

The good news: it’s pretty clear that the industry is moving to flash. The challenging part? Working with customers hand-in-hand to figure out how to get there in a logical and justifiable fashion. And that’s where I think strong partners like GreenPages can really help.

JTC: How do those new products tie into FAST on the array side, with software on the hosts, SSD cards for the servers and SSD arrays?

Chuck: Well, at one level, it’s important that the arrays know about the server-side flash, and vice-versa.

Let’s start with something simple like management: you want to get a single picture of how everything is connected – something we’ve put in our management products like Unisphere. Going farther, the server flash should know when to write persistent data to the array and not keep it locally – that’s what XtremSW does among other things. The array, in turn, shouldn’t be trying to cache data that’s already being cached by the server-side flash – that would be wasteful.

Another way of looking at it is that the new “storage stack” extends beyond the array, across the network and into the server itself. The software algorithms have to know this. The configuration and management tools have to know this. As a result, the storage team and the server team have to work together in new ways. Again, working with a partner that understands these issues is very, very helpful.
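
To make the division of labor Chuck describes a little more concrete, here is a minimal Python sketch of a write-through server-side cache sitting in front of an array. It is an illustration under stated assumptions, not a description of how XtremSW or any EMC product is implemented: the class names `Array` and `ServerFlashCache`, the `capacity` parameter, and the least-recently-used eviction policy are all hypothetical choices made for the example.

```python
from collections import OrderedDict


class Array:
    """Hypothetical stand-in for the external array: holds the persistent copy."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0  # counts reads that actually reached the array

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


class ServerFlashCache:
    """Hypothetical write-through LRU cache living on the server."""

    def __init__(self, array, capacity):
        self.array = array
        self.capacity = capacity
        self.cache = OrderedDict()  # oldest entry first, newest last

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write(self, block_id, data):
        # Write-through: persist to the array first, then keep a local copy.
        self.array.write(block_id, data)
        self._remember(block_id, data)

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # cache hit, no array traffic
            return self.cache[block_id]
        data = self.array.read(block_id)  # cache miss, fetch from the array
        self._remember(block_id, data)
        return data


array = Array()
cache = ServerFlashCache(array, capacity=2)
cache.write("a", b"1")
cache.write("b", b"2")
cache.read("a")         # served from server flash
cache.write("c", b"3")  # evicts "b", the least recently used block
cache.read("b")         # miss: the only read that reaches the array
```

In this sketch every write lands on the array, so losing the server-side cache loses no data, while repeat reads are served locally. That is the sense in which the server flash acts as a cache rather than as persistent, resilient storage.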

JTC: What’s the biggest misperception about cloud storage right now?

Chuck: Anytime you use the word “cloud,” you’re opening yourself up for all sorts of misconceptions, and cloud storage is no exception. The only reasonable way to talk about the subject is by looking at different use cases vs. attempting to establish what I believe is a non-existent category.

Here’s an example: we’ve got many customers who’ve decided to use an external service for longer-term data archiving: you know, the stuff you can’t throw away, but nobody is expected to use. They get this data out of their environment by handing it off to a service provider, and then take the bill and pass it on directly to the users who are demanding the service. From my perspective, that’s a win-win for everyone involved.

Can you call that “cloud storage”? Perhaps.

Or, more recently, let’s take Syncplicity, EMC’s product for enterprise sync-and-share. There are two options for where the user data sits: either an external cloud storage service, or an internal one based on Atmos or Isilon. Both are very specific examples of “cloud storage,” but the decision as to whether you do it internally or externally is driven by security policy, costs and a bunch of other factors.

Other examples include global enterprises that need to move content around the globe, or perhaps someone who wants to stash a safety copy of their backups at a remote location. Are these “cloud storage?”

So, to answer your question more directly, I think the biggest misconception is that – without talking about very specific use cases – we sort of devolve into a hand-waving and philosophy exercise. Is cloud a technology and operational model, or is it simply a convenient consumption model?

The technologies and operational models are identical for everyone, whether you do it yourself or purchase it as a service from an external provider.

JTC: Talk about Big Data and how EMC solutions are addressing that market (Isilon, GreenPlum, what else?).

Chuck: If you thought that “cloud” caused misperceptions, it’s even worse for “big data.” I try to break it down into the macro and the micro.

At the macro level, information is becoming the new wealth. Instead of it being just an adjunct to the business process, it *is* the business process. The more information that can be harnessed, the better your process can be. That leads us to a discussion around big data analytics, which is shaping up to be the “killer app” for the next decade. Business people are starting to realize that building better predictive models can fundamentally change how they do business, and now the race is on. Talk to anyone in healthcare, financial services, retail, etc. – the IT investment pattern has clearly started to shift as a result.

From an IT perspective, the existing challenges can get much, much more challenging. Any big data app is the new 800-pound gorilla, and you’re going to have a zoo-full of them. It’s not unusual to see a 10x or 100x spike in the demand for storage resources when this happens. All of a sudden, you start looking for new scale-out storage technologies (like Isilon, for example) and better ways to manage things. Whatever you were doing for the last few years won’t work at all going forward.

There’s a new software stack in play: think Hadoop, HDFS, a slew of analytical tools, collaborative environments – and an entirely new class of production-grade predictive analytics applications that get created. That’s why EMC and VMware formed Pivotal from existing assets like Greenplum, GemFire et al. – there was nothing in the market that addressed this new need, and did it in a cloud-agnostic manner.

Finally, we have to keep in mind that the business wants “big answers”, and not “big data.” There’s a serious organizational journey involved in building these environments, extracting new insights, and operationalizing the results. Most customers need outside help to get there faster, and we see our partner community starting to respond in kind.

If you’d like a historical perspective, think back to where the internet was in 1995. It was new, it was exotic, and we all wondered how things would change as a result. It’s now 2013, and we’re looking at big data as a potentially more impactful example. We all can see the amazing power; how do we put it to work in our respective organizations?

Exciting times indeed…

Chuck is the Global Marketing CTO at EMC. You can read more from Chuck on his blog and follow him on Twitter at @chuckhollis.