Consuming public cloud services on-premises: A guide

As the public cloud enters its second decade, its role is changing. Infrastructure as a service altered the way that virtualised resources are consumed, but what has emerged is far more powerful than allocating compute, storage, and networking on demand.

The derivative services that the public cloud providers now offer include speech-to-text, sentiment analysis, and machine learning functionality that is constantly being improved. While it is often prudent to run an application on virtual machines or a container cluster on-premises for cost, security, or data gravity reasons, this new breed of public cloud services can often be used in a stateless manner, which allows them to be utilised no matter where the business logic of an application resides.

How are on-prem applications utilising these services today, and how can that usage evolve over time to work at scale?

Common usage today

Today, application code has to be bound to a specific instance of a public cloud service for the interaction between the code and the service to work correctly. Typically, that binding involves standing up an instance of the service using the public cloud console, granting access to a particular user with a particular set of security authorisations, and making the access keys for that user available to the developer, who then has to embed references to both the access keys and the service instance those keys grant access to directly into the application and its deployment.

Here’s an example of that from the developer perspective, using a Kubernetes-based application on-prem to connect to the Google Natural Language API. First, consider the deployment.yaml file that describes how the front-end component of our application should be deployed.
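What follows is a minimal sketch of such a manifest; the image name, secret name, and project ID value are hypothetical placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: example/frontend:1.0  # hypothetical image
        env:
        # Where the mounted access keys can be found inside the container
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/google/key.json
        # Points the client library at the correct instance of the service
        - name: GOOGLE_PROJECT_ID
          value: my-project-id  # placeholder
        volumeMounts:
        - name: google-cloud-key
          mountPath: /var/secrets/google
      volumes:
      # Volume backed by a Kubernetes secret holding the downloaded access keys
      - name: google-cloud-key
        secret:
          secretName: natural-language-api-key  # hypothetical secret name
```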

The key portion for this discussion is at the bottom, where a volume is mounted so that the launched containers can access the local disk that contains the access keys, and where both the access key location (GOOGLE_APPLICATION_CREDENTIALS) and the project ID pointing to the correct instance of the service (GOOGLE_PROJECT_ID) are injected into the container as environment variables.

In the front-end Python code packaged into this container, the first step is to create an instance of the natural language client that is part of the Google Cloud Python client library.
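A sketch of that step, assuming the current language_v1 client (passing the injected project ID through client options is one way to pin requests to the correct service instance):

```python
import os

from google.cloud import language_v1

# The client library automatically looks for the key file referenced by the
# GOOGLE_APPLICATION_CREDENTIALS environment variable set in the deployment.
# The project ID injected alongside it is referenced explicitly so that
# requests count against the correct instance of the service.
client = language_v1.LanguageServiceClient(
    client_options={"quota_project_id": os.environ["GOOGLE_PROJECT_ID"]}
)
```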

Here, a specific reference is made to that project ID, and the client library is smart enough to look for the access key location in the aforementioned environment variable. At this point, the client library can be used to do things like measure the sentiment of an input string.
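A sketch of what that call might look like, reusing the client created above (the helper function name is illustrative):

```python
def measure_sentiment(text: str) -> float:
    # Wrap the input string in a Document and ask the API for its sentiment.
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    # The score ranges from -1.0 (clearly negative) to 1.0 (clearly positive).
    return response.document_sentiment.score
```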

Needless to say, this process is both cumbersome and fragile. What if the volume breaks and the code cannot get to the access keys? What about a typo in the project ID? What happens when you want to change either one?

The problem compounds across an application portfolio, where every application has to repeat these steps for each public cloud service it consumes. Hard-coding project IDs is subject to human error, and rotating access keys – to ensure better security of the public cloud service consumption – forces a new deployment. Usage metrics are locked inside the individual accounts from which the project IDs are generated, making it difficult for anyone to get a real sense of public cloud service usage across multiple applications.

A better future

What is a better way to tackle this problem, so that developers can create applications that are deployed on-prem but still take advantage of public cloud services that would be difficult to replicate? Catalog and brokering tools are emerging that remove many of the steps described above by consolidating public cloud service access into a single interface that is orthogonal to the developer's view of the world. Instead of a developer baking access keys and project IDs into the deployment process, the IT ops staff can provide a container cluster environment that injects the necessary information. This simplifies deployments for the developer and provides a single place to collect aggregate metrics.

For example, a catalog tool can present a screen where an IT ops admin creates an instance of a pub/sub service, before creating a binding for that service to be used by an individual application.

The code required to complete the binding is simpler than the previous example (shown in Node.js).
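A minimal sketch of what that binding code might look like, assuming the broker surfaces the binding as a secret mounted into the container (the mount path and field names are illustrative, not a specific broker's contract):

```javascript
const fs = require('fs');
const { PubSub } = require('@google-cloud/pubsub');

// Read the binding-provided credentials injected by the platform; nothing
// about the keys or project is baked into the deployment itself.
// (The mount path and field names below are hypothetical.)
const binding = JSON.parse(
  fs.readFileSync('/var/run/secrets/pubsub/binding.json', 'utf8')
);

// The client is constructed entirely from the injected binding.
const pubsub = new PubSub({
  projectId: binding.projectId,
  credentials: binding.credentials,
});

// Publish a message to the topic the binding points at.
pubsub
  .topic(binding.topicName)
  .publishMessage({ data: Buffer.from('hello from on-prem') })
  .catch(console.error);
```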

By removing the need to inject binding information during the deployment process and instead having the environment itself handle it, public cloud services can be reused by providing multiple application bindings to the same service. Access keys can be rotated in-memory, improving security without forcing a redeployment. Usage flows through a single point, making metrics collection much easier.

In summary

Certain public cloud services, especially those involving large AI datasets like natural language processing or image analysis, are difficult if not impossible to replicate on-prem. Increasingly, though, users expect applications to contain features based on these services. The trick for any developer or enterprise is to find a way to streamline access to these services across an application portfolio in a way that makes the individual applications more secure and more resilient, and that provides more useful usage metrics.

Current techniques for binding applications to these public cloud services stand in the way – but an emerging set of catalog and brokering tools is making it far easier to deliver on the promises that customers demand.