In today’s data-intensive, hyper-connected world, storage solutions built on the vertical scaling model have become impractical and expensive. Enterprises are driven by the need to remain competitive while storing and managing petabytes of data. In this vein, ESG senior analyst Mark Peters believes that there is at last a straight line between how storage is configured and business value – if organisations can get it right. Vertical scaling is a legacy approach that cannot provide the performance or cost-effectiveness organisations need today, but the adoption of software-defined storage is now enabling data centres to scale competitively.
Another development assists in this goal. Hybrid cloud offers a way for organisations to gain the maximum amount of business flexibility from cloud architectures, which helps them meet budget-efficiency and performance goals at the same time. Because hybrid cloud architectures are still so new, many storage professionals are on the learning curve with them and are only beginning to grasp the benefits and challenges of deploying a hybrid cloud approach.
This article will offer design elements to present to your customers so that their hybrid clouds can deliver the performance, flexibility and scalability they need.
How scale-out NAS factors in
The linchpin that makes this hybrid cloud storage solution possible is a scale-out NAS (network-attached storage). Since hybrid cloud architectures are relatively new to the market, and even newer in full-scale deployment, many organisations are unaware of how important consistency is in a scale-out NAS.
Many environments are only eventually consistent, meaning that files written to one node are not immediately accessible from other nodes. This can be caused by an improper implementation of the protocols, or by integration with the virtual file system that is not tight enough. The opposite is strict consistency: files are accessible from all nodes at the same time. Compliant protocol implementations and tight integration with the virtual file system are a good recipe for success.
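To make the distinction concrete, here is a minimal sketch in Python of what a strict-consistency check could look like: a file is written through one node and must be immediately readable, with identical content, through every other node. NodeClient and its methods are illustrative stand-ins, not any particular vendor's API.

```python
# Illustrative only: NodeClient is a hypothetical stand-in for whatever
# per-node access the NAS exposes (for example an NFS/SMB mount per node).
from dataclasses import dataclass, field


@dataclass
class NodeClient:
    """One NAS node's view of the file system (in-memory stand-in)."""
    name: str
    files: dict = field(default_factory=dict)  # the state this node sees

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes | None:
        return self.files.get(path)


def is_strictly_consistent(nodes: list[NodeClient], path: str, data: bytes) -> bool:
    """Write through the first node, then check the file is immediately
    visible with identical content through every other node."""
    nodes[0].write(path, data)
    return all(node.read(path) == data for node in nodes[1:])


# In a strictly consistent cluster every node sees the same state, so the
# check passes straight away; an eventually consistent set-up might fail
# it until replication catches up.
shared_state: dict = {}
cluster = [NodeClient(f"node{i}", shared_state) for i in range(3)]
print(is_strictly_consistent(cluster, "/projects/report.docx", b"v1"))  # True
```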
In an ideal set-up, a scale-out NAS hybrid cloud architecture is based on three layers, and each server in the cluster runs a software stack built on these layers (a rough sketch in code follows the list).
- The persistent storage layer is layer one. It is based on an object store, which provides advantages like extreme scalability. However, the layer must be strictly consistent in itself.
- Layer two is the virtual file system, which is the core of any scale-out NAS. It is in this layer that features like caching, locking, tiering, quota and snapshots are handled.
- Layer three holds the protocols, such as SMB and NFS, as well as integration points for hypervisors.
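As a rough illustration of how these three layers stack on each node, the sketch below models them as plain Python classes; the class and method names are assumptions for illustration, not a specific product's interfaces.

```python
# Illustrative sketch of the three-layer stack described above.

class ObjectStore:
    """Layer one: persistent, strictly consistent object store."""
    def __init__(self):
        self._objects = {}

    def put(self, key: str, blob: bytes) -> None:
        self._objects[key] = blob

    def get(self, key: str) -> bytes:
        return self._objects[key]


class VirtualFileSystem:
    """Layer two: the core of the scale-out NAS; caching, locking,
    tiering, quotas and snapshots would live here."""
    def __init__(self, store: ObjectStore):
        self._store = store

    def write_file(self, path: str, data: bytes) -> None:
        self._store.put(f"data:{path}", data)

    def read_file(self, path: str) -> bytes:
        return self._store.get(f"data:{path}")


class ProtocolLayer:
    """Layer three: protocol front ends (SMB, NFS, hypervisor integrations)
    that all talk to the same virtual file system."""
    def __init__(self, vfs: VirtualFileSystem):
        self._vfs = vfs

    def handle_write(self, protocol: str, path: str, data: bytes) -> None:
        # Regardless of the protocol used, the request lands in the same VFS.
        self._vfs.write_file(path, data)


# Each server in the cluster would run this same stack.
node_stack = ProtocolLayer(VirtualFileSystem(ObjectStore()))
node_stack.handle_write("nfs", "/home/alice/notes.txt", b"hello")
```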
It is very important to keep the architecture symmetrical and clean. If organisations manage to do that, many future architectural challenges will be much easier to solve.
The storage layer deserves closer examination. Because it is based on an object store, the storage solution can be scaled easily. With a clean and symmetrical architecture, it can reach exabytes of data and trillions of files.
The storage layer is responsible for ensuring redundancy, so a fast and effective self-healing mechanism is needed. To keep the data footprint low in the data centre, the storage layer needs to support different file encodings. Some are good for performance and some for reducing the footprint.
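The article leaves the encodings themselves open; as an assumption for illustration, the sketch below uses replication and erasure coding, two common examples, to show how an encoding could be chosen per file or tier depending on whether performance or footprint matters most.

```python
# Hypothetical policy sketch: the text only says some encodings favour
# performance and some favour a small footprint; replication and erasure
# coding are used here as familiar examples, not as the specific encodings
# any given product implements.
from dataclasses import dataclass


@dataclass
class Encoding:
    name: str
    raw_overhead: float  # bytes stored per byte of user data


REPLICATION_3X = Encoding("3-way replication", raw_overhead=3.0)    # fast, large footprint
ERASURE_8_3 = Encoding("8+3 erasure coding", raw_overhead=11 / 8)   # compact, heavier rebuilds


def choose_encoding(hot: bool) -> Encoding:
    """Pick an encoding per file or tier: performance for hot data,
    low footprint for cold data."""
    return REPLICATION_3X if hot else ERASURE_8_3


for path, hot in [("/vmstore/db.vmdk", True), ("/archive/2016/logs.tgz", False)]:
    enc = choose_encoding(hot)
    print(f"{path}: {enc.name} ({enc.raw_overhead:.2f}x raw capacity)")
```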
Dealing with metadata
Why is metadata such a vital component of the virtual file system? In a virtual file system, metadata are pieces of information that describe the structure of the file system. For example, one metadata file can contain information about what files and folders are contained in a single folder in the file system. That means that we will have one metadata file for each folder in our virtual file system. As the virtual file system grows, we will get more and more metadata files.
Storing metadata centrally may be fine for smaller set-ups, but here we are talking about scale-out. So, let’s look at where not to store metadata. Storing metadata on a single server leads to poor scalability, poor performance and poor availability. Since our storage layer is based on an object store, a better place to store all our metadata is in the object store – particularly when we are talking about high quantities of metadata. This ensures good scalability, good performance and good availability.
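A minimal sketch of that idea follows, assuming one JSON metadata object per folder, keyed by the folder's path; the key scheme and layout are hypothetical, but they show how folder listings can be served from the object store rather than from a single metadata server.

```python
# Sketch of the "one metadata file per folder" idea, with the metadata kept
# in the same object store as the data. Key names and JSON layout are assumptions.
import json


class ObjectStore:
    """Minimal in-memory stand-in for the strictly consistent object store."""
    def __init__(self):
        self._objects = {}

    def put(self, key: str, blob: bytes) -> None:
        self._objects[key] = blob

    def get(self, key: str) -> bytes:
        return self._objects[key]


def folder_metadata_key(folder_path: str) -> str:
    # One metadata object per folder, addressed by a predictable key,
    # so no single metadata server sits in the hot path.
    return f"meta:{folder_path}"


def write_folder_metadata(store: ObjectStore, folder_path: str, entries: list[dict]) -> None:
    store.put(folder_metadata_key(folder_path), json.dumps(entries).encode())


def list_folder(store: ObjectStore, folder_path: str) -> list[dict]:
    return json.loads(store.get(folder_metadata_key(folder_path)))


store = ObjectStore()
write_folder_metadata(store, "/projects", [
    {"name": "report.docx", "type": "file", "size": 18432},
    {"name": "drafts", "type": "folder"},
])
print(list_folder(store, "/projects"))
```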
Meeting the need for speed
Performance can be an issue with software-defined storage solutions, so they need caching devices to compensate. From a storage solution perspective, both speed and size matter – as well as price; finding the sweet spot is important. For an SDS solution, it is also important to protect data at a higher level by replicating it to another node before destaging it to the storage layer.
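A rough sketch of that write path, under the assumption of one cache device per node and a single designated peer: absorb the write in the local cache, replicate it to the peer for protection, then destage it to the storage layer. All class names are illustrative.

```python
# Illustrative write path: cache locally, replicate to a peer, then destage.

class CacheDevice:
    """Stand-in for a fast local caching device (for example an NVMe SSD)."""
    def __init__(self):
        self.pending = {}

    def write(self, path: str, data: bytes) -> None:
        self.pending[path] = data

    def drop(self, path: str) -> None:
        self.pending.pop(path, None)


class Node:
    def __init__(self, name: str, storage_layer: dict):
        self.name = name
        self.cache = CacheDevice()
        self.storage_layer = storage_layer  # shared, object-store-backed layer
        self.peer = None                    # partner node used for cache protection

    def write(self, path: str, data: bytes) -> None:
        # 1. Absorb the write in the local cache device.
        self.cache.write(path, data)
        # 2. Replicate the cached write to a peer node so a single node
        #    failure cannot lose data that has not been destaged yet.
        if self.peer is not None:
            self.peer.cache.write(path, data)
        # 3. Destage to the storage layer (done immediately here; a real
        #    system would destage asynchronously) and release the cache copies.
        self.storage_layer[path] = data
        self.cache.drop(path)
        if self.peer is not None:
            self.peer.cache.drop(path)


storage: dict = {}
node_a, node_b = Node("a", storage), Node("b", storage)
node_a.peer, node_b.peer = node_b, node_a
node_a.write("/vm/disk0.img", b"\x00" * 16)
print(sorted(storage))  # ['/vm/disk0.img']
```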
As the storage solution grows in both capacity and features, particularly in virtual or cloud environments, supporting multiple file systems and domains becomes more important. Supporting multiple protocols matters too: different applications and use cases prefer different protocols, and sometimes the same data needs to be accessible across different protocols.
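What "multiple file systems and domains" might look like as configuration is sketched below; the field names, domains and quotas are assumptions chosen purely for illustration.

```python
# Hypothetical configuration sketch: each domain (tenant) gets its own file
# system, and each file system can be exported over more than one protocol.
from dataclasses import dataclass, field


@dataclass
class FileSystemConfig:
    name: str
    domain: str                                   # authentication/tenant domain
    protocols: list[str] = field(default_factory=list)
    quota_tb: int | None = None                   # optional per-file-system quota


filesystems = [
    FileSystemConfig("engineering", domain="corp.example", protocols=["nfs", "smb"], quota_tb=200),
    FileSystemConfig("vdi-images", domain="corp.example", protocols=["smb"], quota_tb=50),
    FileSystemConfig("analytics", domain="research.example", protocols=["nfs"]),
]

# The same data in a file system can be reached over every protocol it exports.
for fs in filesystems:
    print(f"{fs.domain}/{fs.name}: exported over {', '.join(fs.protocols)}")
```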
NAS, VMs and SDS
Hypervisors will need support in the hybrid cloud, so the scale-out NAS must also be able to run in a hyper-converged configuration. Being software-defined makes sense here.
When there are no external storage systems, the scale-out NAS must be able to run as a virtual machine and make use of the hypervisor host’s physical resources. The guest virtual machines’ (VMs’) own images and data will be stored in the virtual file system that the scale-out NAS provides. The guest VMs can use this file system to share files between them, making it well suited to VDI environments as well.
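The hyper-converged layout described here can be pictured roughly as follows; host names, device paths and VM names are invented for the example, and the model is deliberately simplified.

```python
# Simplified, illustrative model of a hyper-converged set-up: each physical
# host runs one scale-out NAS VM alongside ordinary guest VMs, and the guests'
# images live on the file system those NAS VMs provide.
from dataclasses import dataclass, field


@dataclass
class GuestVM:
    name: str
    image_path: str          # lives on the shared scale-out NAS file system


@dataclass
class HypervisorHost:
    name: str
    nas_vm: str              # the scale-out NAS running as a VM on this host
    local_disks: list[str]   # physical disks handed to the NAS VM
    guests: list[GuestVM] = field(default_factory=list)


hosts = [
    HypervisorHost("host-01", nas_vm="nas-node-01", local_disks=["/dev/nvme0n1"],
                   guests=[GuestVM("vdi-001", "/vmstore/vdi-001.img")]),
    HypervisorHost("host-02", nas_vm="nas-node-02", local_disks=["/dev/nvme0n1"],
                   guests=[GuestVM("vdi-002", "/vmstore/vdi-002.img")]),
]

# The NAS VMs on all hosts form one cluster, so every guest image path above
# refers to the same shared, strictly consistent file system.
for host in hosts:
    for guest in host.guests:
        print(f"{guest.name} on {host.name} -> {guest.image_path}")
```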
Now, why is it important to support many protocols? Well, in a virtual environment there are many different applications running, each with different protocol needs. By supporting many protocols, we keep the architecture flat and, to some extent, gain the ability to share data between applications that speak different protocols.
What we end up with is a very flexible and useful storage solution. It is software-defined, supports both fast and energy-efficient hardware, has an architecture that allows us to start small and scale up, supports bare-metal as well as virtual environments, and has support for all major protocols.
Sharing the file system
Because there are multiple sites, each of them will have its own independent file system. A likely scenario is that different offices have a need for both a private area and an area that they share with other branches. So only parts of the file system will be shared with others.
Letting other sites mount one part of the file system at any given point in their own file systems provides the flexibility needed to scale the file system outside the four walls of the office. Synchronisation must happen at the file-system level so that there is a consistent view of the file system across sites. Being able to specify different file encodings at different sites is also useful, for example if one site is used as a backup target.
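One way to picture such a multi-site set-up is sketched below, assuming a shared area owned by one site that other sites mount into their own file systems, each with its own encoding; all site names, paths and encoding labels are illustrative.

```python
# Illustrative multi-site topology: one site owns a shared part of its file
# system, and other sites mount it at a point of their choosing, possibly
# with a different, footprint-friendly encoding on a backup target.
from dataclasses import dataclass


@dataclass
class SiteMount:
    site: str
    mount_point: str   # where this site mounts the shared area in its own file system
    encoding: str      # per-site encoding choice


@dataclass
class SharedArea:
    owner_site: str
    source_path: str   # the part of the owner's file system that is shared
    mounts: list[SiteMount]


shared = SharedArea(
    owner_site="london",
    source_path="/fs/london/projects/shared",
    mounts=[
        SiteMount("stockholm", "/fs/stockholm/remote/london", encoding="replication"),
        SiteMount("backup-dc", "/fs/backup/london", encoding="erasure"),  # backup target
    ],
)

for m in shared.mounts:
    print(f"{m.site} mounts {shared.owner_site}:{shared.source_path} "
          f"at {m.mount_point} using {m.encoding}")
```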
Flexible, scale-out storage
When all of the above considerations are implemented, they create a next-generation hybrid cloud system. One file system spans all servers so that there are multiple points of entry to prevent performance bottlenecks. The solution offers flash support for high performance and native support of protocols. Scale-out is flexible; just add a node. The solution is clean and efficient, enabling linear scaling up to exabytes of data. It is an agile, cost-efficient approach to data centre expansion.