We live in an environment where everything is changing. Business requirements are changing. User demands are constantly in flux. And our infrastructure is continually changing too. Frankly, the infrastructure has always been in a constant state of change, but in the past we pretended that we could get it to a point of stability, that we could reach a state of “done.” Once we finished setting up that totally stable infrastructure, we could run everything on top with no problems, right?
IT is perpetually in firefighting mode because it treats change as the exception, not the rule. Yet, change is the only constant in our world.
The increased use of containers in recent years has come largely from the value of the container image: a deployable artefact (the Docker image) that bundles together all dependencies, from the operating system through middleware to the application components, enabling significant advancements in development and operations (DevOps) efficiency. And the speed with which containers can be launched has helped expand and refine practices around infrastructure as code and immutable infrastructure. But containers alone do not address the need for constant adaptation.
Just as with the infrastructure virtualisation ushered in by VMware 20 years ago, and later delivered as a service starting with AWS, the introduction and early adoption of containers has left much of the way IT works largely unchanged. Yes, the use of automation has increased (the very infrastructure as code just mentioned), but it has largely automated existing practices: a script to install the Docker runtime on three hosts, another to “docker run” three different microservice images, and another to adjust firewall rules to allow traffic through.
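To make that concrete, here is a minimal sketch of this style of automation, written in Python purely for illustration; the hostnames, image names, and port are hypothetical, and it assumes SSH access to apt-based hosts running ufw:

```python
import subprocess

# Hypothetical hosts and microservice images, for illustration only.
HOSTS = ["docker-host-1", "docker-host-2", "docker-host-3"]
IMAGES = ["web:1.0", "catalogue:1.0", "shopping-cart:1.0"]

# Script 1: install the Docker runtime on each host.
for host in HOSTS:
    subprocess.run(["ssh", host, "sudo apt-get install -y docker.io"], check=True)

# Script 2: "docker run" one microservice image on each host.
for host, image in zip(HOSTS, IMAGES):
    subprocess.run(["ssh", host, f"sudo docker run -d {image}"], check=True)

# Script 3: adjust firewall rules so traffic can reach the containers
# (assumes ufw; the port is arbitrary).
for host in HOSTS:
    subprocess.run(["ssh", host, "sudo ufw allow 8080/tcp"], check=True)
```

Notice that nothing here keeps watching the containers after the scripts exit.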
This automation still assumes a level of stability: after running the scripts we are “done” and things will just keep humming along. But when, for example, two of the Docker hosts suddenly become unavailable, the team is once again in firefighting mode.
Enter container orchestration. The most popular container orchestration system in the industry today is Kubernetes, and with good reason. What makes Kubernetes and similar systems really shine is that they operate in a mode that anticipates constant change.
The Kubernetes model is so effective because it allows a user to say “here’s my desired state: I want 2 instances of my user-facing web page, 3 instances of my catalogue service, and 10 instances of my shopping cart service,” and Kubernetes just makes it so. It is a declarative model for defining complex systems. Kubernetes constantly monitors the actual state of the system, and any time it differs from the desired state, Kubernetes remediates. Kubernetes has change-tolerance built into its DNA.
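As a sketch of what expressing that desired state looks like programmatically (the same can be done with a few lines of YAML and “kubectl apply”), the following uses the official Kubernetes Python client to declare each service as a Deployment; the image names and the use of the default namespace are assumptions for illustration:

```python
from kubernetes import client, config

# Desired state: service name -> (image, replica count).
# Replica counts come from the example above; image names are hypothetical.
DESIRED = {
    "web": ("example.com/web:1.0", 2),
    "catalogue": ("example.com/catalogue:1.0", 3),
    "shopping-cart": ("example.com/shopping-cart:1.0", 10),
}

config.load_kube_config()  # use the current kubeconfig context
apps = client.AppsV1Api()

for name, (image, replicas) in DESIRED.items():
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1DeploymentSpec(
            replicas=replicas,  # the desired state; Kubernetes reconciles toward it
            selector=client.V1LabelSelector(match_labels={"app": name}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": name}),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(name=name, image=image)]
                ),
            ),
        ),
    )
    apps.create_namespaced_deployment(namespace="default", body=deployment)
```

From here on, if a pod (or the node underneath it) dies, the declared replica count is restored without anyone re-running a script.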
Another thing that taxes IT teams is the variability in their infrastructure. There are different server and storage platforms, and an arguably even more varied set of networking solutions. Increasingly, enterprises are going hybrid, leveraging a combination of on-premises and public cloud infrastructures. This means that not only must IT teams become experts in the management interfaces for many different clouds, but the scripts they write to automate the myriad of different tasks must also be written and maintained for each different infrastructure.
Kubernetes addresses this by providing abstractions over the top of the varied infrastructure assets, allowing Kubernetes consumers to leverage that infrastructure through common entities such as workloads (pods and replica sets), networks and network policies (NetworkPolicy), and storage (storage classes and persistent volume claims). Kubernetes is designed to adapt to the infrastructure.
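For instance, a workload can claim storage without knowing which storage platform sits underneath: the claim below only names a storage class, and whether that class is backed by vSAN, EBS, or something else is the cluster’s concern. This is a minimal sketch with the Python client; the “fast” storage class and the default namespace are assumptions:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Ask for 10Gi of storage from the (assumed) "fast" storage class.
# The claim is portable: the same spec works on any cluster that
# defines a class with that name, regardless of the backing storage.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="catalogue-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="fast",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```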
Finally, and perhaps the thing that gets me most excited about Kubernetes, is its extensibility. Out of the box, Kubernetes already delivers a whole host of resource types (pods, storage classes, roles, and so much more) and functionality to lifecycle-manage those resources (replica sets, daemon sets, stateful sets, and more). But stateful workloads in particular, such as databases, caches, and indexing services, each have unique needs; the way that MongoDB protects the data it stores is quite different from the way that MySQL does, for example. Kubernetes allows custom resource definitions (CRDs) and associated behaviours (one of the most popular means for this is via an operator) to be added, effectively extending the reach of the platform. That is, Kubernetes can be adapted to host and manage a virtually endless set of different types of workloads.
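As a sketch of the mechanics, registering a new resource type is itself just another API call. The MySQLCluster resource below is hypothetical and illustrative only; in practice an operator watching these objects would supply the MySQL-specific behaviour:

```python
from kubernetes import client, config

config.load_kube_config()
ext = client.ApiextensionsV1Api()

# A hypothetical CRD: after this call the cluster understands
# "kind: MySQLCluster" objects, even though Kubernetes itself
# knows nothing about MySQL.
crd = client.V1CustomResourceDefinition(
    metadata=client.V1ObjectMeta(name="mysqlclusters.example.com"),
    spec=client.V1CustomResourceDefinitionSpec(
        group="example.com",
        scope="Namespaced",
        names=client.V1CustomResourceDefinitionNames(
            plural="mysqlclusters", singular="mysqlcluster", kind="MySQLCluster"
        ),
        versions=[
            client.V1CustomResourceDefinitionVersion(
                name="v1",
                served=True,
                storage=True,
                schema=client.V1CustomResourceValidation(
                    open_api_v3_schema=client.V1JSONSchemaProps(
                        type="object",
                        properties={
                            "spec": client.V1JSONSchemaProps(
                                type="object",
                                properties={
                                    "replicas": client.V1JSONSchemaProps(type="integer")
                                },
                            )
                        },
                    )
                ),
            )
        ],
    ),
)
ext.create_custom_resource_definition(body=crd)
```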
When you look at the abstractions that Kubernetes provides, it’s easy to think of it as a new API for infrastructure: its base primitives are compute, storage, and network, just as with server virtualisation. What sets it apart is its tolerance for change.
Who is Kubernetes for?
Just like Docker, and server virtualisation before it, Kubernetes initially captured the mindshare of the developer. Particularly now that developers are increasingly responsible for keeping their software running well in production, having an intelligent, autonomous system that helps them with those operational tasks is hugely valuable. App operations involves not only the day 1 task of deployment but also maintenance in the face of infrastructure changes, security vulnerabilities, and more.
It’s rare these days that I speak to an enterprise that does not have some, sometimes substantial, container-centric efforts going on. Often these have grown out of a development group that has built its practices around containers. They’re building Docker images for their apps but, because the enterprise does not already have a production platform that can run those images, the same app teams are managing the container platform. Just as enterprise IT provides centralised, secure, compliant, and resilient virtualised infrastructure environments, the time has come for it to provide secure, compliant, and resilient container platforms.
As Kubernetes becomes mission critical
With the capabilities it brings for running and managing mission-critical workloads, Kubernetes itself must be equally resilient to change. If a security vulnerability is found that requires Kubernetes to be upgraded, it must be patched quickly and with zero downtime for the workloads it is hosting.
If application capacity requirements suddenly spike, Kubernetes capacity must be quickly expanded to meet the need. And when the spike has passed, Kubernetes needs to be right-sized again to keep IT infrastructure costs in check.
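The building blocks for this are exposed through the same declarative API that workloads use. As a small sketch (the node name is hypothetical), right-sizing starts by cordoning a node so its workloads can be rescheduled elsewhere before the node is removed; this is the same primitive that “kubectl cordon” uses:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Mark the (hypothetical) node unschedulable so no new pods land on it.
# Evicting the pods already running there is a separate step, after
# which the node can be removed and Kubernetes reschedules the work.
core.patch_node("worker-node-3", {"spec": {"unschedulable": True}})
```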
These are exactly the challenges that Kubernetes is addressing for containerised workloads. The key is to use the same principles and techniques that Kubernetes uses for workloads to manage Kubernetes itself.