Orchestration in the cloud: What is it all about?

Can orchestration be considered a better alternative to provisioning and configuration management, especially for cloud-native applications? We can look at this from a variety of angles: comparing it against data centre-oriented solutions; differentiating orchestration of infrastructure (in the cloud and out of it) from orchestration of containers (focusing mostly on the cloud); and looking at best practices under different scenarios.

It’s worth noting here that this topic could fill not only a plethora of articles but a plethora of books. But as the great Richard Feynman used to say, it is not only about reading or working through problems, but also about discussing ideas, talking about them, and communicating them to others.

I would like to start with my favourite definition of orchestration, found in the Webster dictionary. Orchestration is ‘harmonious organisation’.

Infrastructure or containers?

When discussing orchestration, inevitably, the first question we ask ourselves is: infrastructure orchestration or container orchestration?

These are two separate Goliaths to engage, but undoubtedly we will face them both in the current IT arena. It all depends on the level of abstraction we wish to attain, on how we organise the stack, and on which layers we want to take care of ourselves and which we would rather leave to someone else.

If we have decided to manage at the infrastructure level, we will work with virtual machines and/or bare metal servers – in other words, either multi-tenant or single-tenant servers. Say we consume our cloud in an IaaS fashion: we are handed the aforementioned resources plus networking, storage, load balancers, databases, DNS, and so on. From there, we build our infrastructure as we prefer.

If we have decided to manage at the CaaS (sometimes seen as PaaS) level, we will be managing the lifecycle of containers or, as they are frequently referred to in the literature, workloads. For those unfamiliar with them, containers are a not-so-new way of packaging and running workloads. Some of the most popular container technologies are Docker, rkt, and LXC. Containers are extremely well suited to immutable architectures and to microservices, not to mention that they are lightweight, easily portable, and can be packaged to use another day.

There are pros and cons to each of these – but for now, let us proceed to discuss the orchestration aspect at these two levels.

Infrastructure

There are several ways to orchestrate infrastructure; here are the two that seem to be among the most popular in companies today.

Provisioning and configuration management: One way of doing this is the solid, old-school combination of PXE and Kickstart files. Although it is slowly being replaced by more automated solutions, some companies still stick to it, or to alternatives such as Cobbler. On the other side, there are tools such as Foreman. Foreman supports BIOS and UEFI across different operating systems, and it integrates with configuration management tools such as Puppet and Chef. Foreman shines in data centre provisioning and leaves us with an easy-to-manage infrastructure, ready to be used or to be configured further.

Once provisioning is complete, we move on to configuration management, which allows us to manage resources throughout their lifecycle. There are many flavours: Ansible, Chef, Puppet, Salt, and even the old and reliable CFEngine. The last two are my favourites, along with Ansible, a Swiss army knife that has helped me many times thanks to its simplicity and its masterless way of working.

Orchestration and optional configuration management: Now, orchestration implies something conceptually different – as mentioned before, harmonious organisation – and the tool frequently used for it nowadays is Terraform. On the upside, it allows us to orchestrate in a data centre or in the cloud, integrating with different clouds such as AWS, Oracle Cloud, Azure, and even AliCloud. Terraform has many providers, and the flexibility of resource management often lies in that underlying layer. Besides the cloud providers, it is also possible to integrate Terraform with third parties such as PagerDuty and handle all types of resources. From first-hand experience, that sort of integration was smooth and simple although, granted, sometimes not mature enough.

Not all providers will yield the same flexibility. When I started to work with Terraform in Oracle Cloud, OCI did not yet have the maturity to do auto-scaling; hence, the provider did not allow Terraform to create auto-scaling groups, a feature so vital that I had taken it for granted after working with Terraform and AWS in the past. So another tip is to take a look at the capabilities of the provider, whether cloud or anything else. Sometimes our tools simply do not integrate well with each other, and when designing a proper architecture, that is an aspect which cannot be taken lightly.

Another plus of Terraform is that it allows us to orchestrate any piece of infrastructure, not only compute: it ranges from virtual machines and bare metal to networking and storage resources. Again, it will depend on the cloud and on the Terraform provider and plugins used.

What makes Terraform a new-generation tool is not only the orchestration, but the infrastructure as code (IaC) aspect. The industry has steered towards IaC everywhere, and Terraform is no exception. We can store our resource definitions in files in any version control system, be it Git, SVN, or any other, and that is massive: it gives us versioned infrastructure, teams can collaborate and everybody is up to speed, and it is possible to manage branches and define different releases, separating versions of infrastructure and environments such as production, staging, UAT, and so on. This is now considered a must: it is not wishful thinking, but best practice.
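To make the idea of versioned, declarative infrastructure a little more concrete, here is a minimal Python sketch of the kind of comparison a declarative tool makes between the desired state kept in version control and what currently exists. It is not Terraform's actual algorithm or API; the `plan` function, the resource names, and the attributes are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# A state maps a resource name to its attributes.
# Every name and attribute below is a hypothetical example.
State = Dict[str, Dict[str, object]]


def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return sorted (action, resource_name) pairs that would move the
    actual state towards the desired state."""
    actions: List[Tuple[str, str]] = []
    for name, attrs in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != attrs:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return sorted(actions)


# Desired state: what the files in version control describe.
desired_state: State = {
    "web-1": {"shape": "small", "environment": "staging"},
    "web-2": {"shape": "small", "environment": "staging"},
}
# Actual state: what currently exists in the environment.
actual_state: State = {
    "web-1": {"shape": "large", "environment": "staging"},
    "old-db": {"shape": "medium", "environment": "staging"},
}

planned_actions = plan(desired_state, actual_state)
for action, name in planned_actions:
    print(f"{action}: {name}")
```

Because the desired state is just data in a file, a change to it can be reviewed, branched, and released like any other code change, which is the point of keeping it in version control.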

Once the initial steps with Terraform are done, the provisioning can be completed with something such as Cloud-Init, although any bootstrapping tool will do. A popular alternative here seems to be Ansible: I have used it and, as stated previously, it is a Swiss army knife for small, simple initial tasks. If we are starting to work in the cloud, Cloud-Init will fit the bill. After that, other configuration management tools can take over.
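As a rough illustration of what such a bootstrapping step can look like, the Python sketch below renders a small Cloud-Init user-data document. It assumes the `#cloud-config` format with its `package_update`, `packages`, and `runcmd` keys; the `render_user_data` helper, the package, and the command are hypothetical examples, not something taken from a real deployment.

```python
import json
from typing import Iterable


def render_user_data(packages: Iterable[str], commands: Iterable[str]) -> str:
    """Render a cloud-config document as text.

    The body is emitted as JSON, which is also valid YAML here, so the
    sketch does not need a third-party YAML library.
    """
    config = {
        "package_update": True,      # refresh the package index on first boot
        "packages": list(packages),  # packages to install
        "runcmd": list(commands),    # shell commands to run once
    }
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


# Hypothetical example: install and start a web server on first boot.
user_data = render_user_data(
    packages=["nginx"],
    commands=["systemctl enable --now nginx"],
)
print(user_data)
```

The resulting text is what would be handed to the instance as user data at creation time, for example through the corresponding argument of the compute resource in the orchestration tool.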

That being said, I am an advocate of immutable infrastructure, so I limit configuration management to the minimum. My view is that in the future, configuration management tools will not be needed. If and when something fails, it should be destroyed and re-instantiated. System administrators should not need to know the names of resources, and should only SSH into them as a last resort – if ever.
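The destroy-and-re-instantiate idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real cloud API: `Instance`, `launch`, and `replace_unhealthy` are hypothetical names, and the health check is simply a function passed in by the caller.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, List

_ids = itertools.count(1)  # hypothetical source of fresh instance identifiers


@dataclass(frozen=True)
class Instance:
    """An immutable record: instances are never modified after launch."""
    instance_id: str
    image: str


def launch(image: str) -> Instance:
    """Pretend to start a brand-new instance from a known image."""
    return Instance(instance_id=f"i-{next(_ids)}", image=image)


def replace_unhealthy(
    fleet: List[Instance],
    is_healthy: Callable[[Instance], bool],
    image: str,
) -> List[Instance]:
    """Return a new fleet where each unhealthy instance is swapped for a
    fresh one built from the image; nothing is repaired in place."""
    return [inst if is_healthy(inst) else launch(image) for inst in fleet]


fleet = [launch("app-v1") for _ in range(3)]
broken = {fleet[1].instance_id}  # simulate one failed instance
new_fleet = replace_unhealthy(
    fleet, lambda inst: inst.instance_id not in broken, "app-v1"
)
print([inst.instance_id for inst in new_fleet])
```

Note that nobody needs to know which instance failed or log into it: the failed one is discarded and an identical replacement takes its slot.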

Container orchestration

Containers are not a new thing anymore; they have been around for a few years (or decades, depending on how we look at it), and they are stable and useful enough that we may choose them for our platform.

Although containers in a data centre are fun, containers in the cloud are amazing, especially because most clouds nowadays provide us with container orchestration, and a plethora of solutions exist in case we cannot get enough. Some examples include ECS, Amazon Elastic Container Service; ACS, Azure Container Service; CoreOS Fleet; Docker Swarm; GKE, Google Kubernetes Engine (formerly Google Container Engine); Kubernetes; and others.

Although I have left Kubernetes last, it has taken the spotlight. There are three reasons this tool has a future:

  • It was designed by Google, and that has merit on its own: it builds on the company's experience of running containers in a humongous environment
  • It is the project selected by the Cloud Native Computing Foundation (CNCF), which gives it a better chance of staying afloat. The CNCF is very important for cloud-native applications and it is supported by many companies (such as Oracle)
  • The architecture is simple and easy to learn, can be deployed rapidly, and scaled easily

Kubernetes is a very promising tool that is already delivering results. If you are thinking about container orchestration at scale, starting with something such as Minikube and slowly progressing to easy-to-use tools such as Rancher will significantly help to pave the road ahead.
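For readers new to container orchestration, the core idea is that we declare how many copies of a workload we want and the orchestrator keeps reality in line with that number. The Python sketch below is a simplified illustration of one such reconciliation pass; it does not use or mirror the Kubernetes API, and `reconcile`, the `web` prefix, and the container names are hypothetical.

```python
from typing import List


def reconcile(desired_replicas: int, running: List[str], prefix: str = "web") -> List[str]:
    """Return the container names that should be running after one pass."""
    if desired_replicas < 0:
        raise ValueError("desired_replicas must not be negative")
    result = list(running)  # work on a copy; the input is left untouched
    index = 0
    # Too few containers: start new ones under the first unused names.
    while len(result) < desired_replicas:
        name = f"{prefix}-{index}"
        if name not in result:
            result.append(name)
        index += 1
    # Too many containers: stop the most recently listed ones.
    while len(result) > desired_replicas:
        result.pop()
    return result


# Three replicas are wanted, but one container has crashed and disappeared.
after_crash = reconcile(3, ["web-0", "web-2"])
print(after_crash)

# The desired count is lowered to one, so the surplus is stopped.
after_scale_down = reconcile(1, after_crash)
print(after_scale_down)
```

A real orchestrator runs this kind of comparison continuously and also decides where each container is placed, but the declared-versus-observed pattern is the same.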

Conclusion

As has been shown, there are many solutions, and the right one depends on what sort of infrastructure is being managed, where it is located, its scale, and how it is currently distributed.

Technologies can also be used jointly. Before Oracle Cloud had OKE (Oracle Kubernetes Engine), the way we implemented Kubernetes in the cloud was through a Terraform plugin that instantiated the necessary infrastructure and then deployed the Kubernetes cluster on top of it, leaving us to continue configuring, managing, and installing applications such as Elasticsearch.

The industry is moving towards the cloud, and that new paradigm means everything being delivered as a service (XaaS). This in turn means that building distributed architectures that are reliable, performant, and scalable, and doing so at a lower cost, will be, and for some companies already is, a huge competitive advantage.

Nonetheless, there are many technologies to choose from. Often, aligning with the industry standard is a smart decision. A standard is proven, used by companies, under active development, and likely to be maintained for years ahead.